Playing with Hashes from an FTP Flow in Perl


OK, I'm having issues understanding how hashes work. Long story short, I'm attempting to parse through an FTP log and find the relevant flows for specific search criteria. What I'm trying to make is this: I have an IP address or a user name, and I first do a pretty simple grep to minimize the data I don't need and send the output to an external file. If I'm searching for the username testing1, it greps on testing1 and sends the output to a file called output.txt:

    dec  2 00:14:09 ftp1 ftpd[743]: user testing1
    dec  2 00:14:09 ftp1 ftpd[743]: ftp login 192.168.0.2 [192.168.0.2], testing1
    dec  2 00:30:08 ftp1 ftpd[1261]: user testing1
    dec  2 00:30:09 ftp1 ftpd[1261]: ftp login 192.168.0.4 [192.168.0.4], testing1
    dec  2 01:12:33 ftp1 ftpd[11804]: user testing1
    dec  2 01:12:33 ftp1 ftpd[11804]: ftp login 192.168.0.2 [192.168.0.2], testing1

And below is an example of the originating log data:

    dec  1 23:59:03 ftp1 ftpd[4152]: user testing1
    dec  1 23:59:03 ftp1 ftpd[4152]: pass password
    dec  1 23:59:03 ftp1 ftpd[4152]: ftp login 192.168.0.02 [192.168.0.2], testing1
    dec  1 23:59:03 ftp1 ftpd[4152]: pwd
    dec  1 23:59:03 ftp1 ftpd[4152]: cwd /test/data/
    dec  1 23:59:03 ftp1 ftpd[4152]: type image

I then go in and put the process IDs I find, along with the time of each ID, into a hash. See below:

    $var1 = {
          '743' => [
                     '00:1'
                   ],
          '20687' => [
                       '01:3'
                     ],
          '27186' => [
                       '15:3'
                     ],
          '6929' => [
                      '12:0'
                    ],
          '24771' => [
                       '09:0'
                     ],
          '11804' => [
                       '01:1'
                     ],
          '27683' => [
                       '08:3'
                     ],
          '14976' => [
                       '04:3'
                     ],
    };

It looks as if the time is being put into the hash as an array. I was unable to figure out why this is happening, so I decided to work with the array. The following is how the hash of arrays is created:

    # -------------------------------------------------------
    # extract pids, time lines, take out doubles
    # -------------------------------------------------------
    $infile3 = 'output.txt';
    %pids;
    $found;
    $var;

    open (input2, $infile3) or die "couldn't read $infile3.\n";

    while (my $line = <input2>) {
        if($line =~ /(\d{2})\:(\d)/ ) {
            $hhmm = $1 . ":" . $2;
            if ($line =~ /ftpd\[(.*?)\]/) {
                $found = 0;
                foreach $var(keys %pids){
                    if(grep $1 =~ $var, keys %pids){
                        $found = 1;
                    }
                }
                if ($found == 0){
                    push @{$pids{$1}}, $hhmm;
                }
            }
        }
    }

To speed things up, I have decided to read the lines that have matching PIDs, whether they fit the flow or not, into an array so I don't have to keep reading in the originating file.

    ##-------------------------------------------------------
    ## read each line of the file into an array
    ##-------------------------------------------------------
    open (input, $infile2) or die "couldn't read $infile2.\n";

    @messages;

    while (my $line = <input>){
        # if there is a match on the pid, put the line in the array
        if ($line =~ /ftpd\[(.*?)\]/){
            $mpid = $1;
            foreach $key (keys %pids){
                if ($key =~ $mpid){
                    push @messages, $line;
                }
            }
        }
    }

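I'm then trying to match each line by PID and time to the flow. I'm matching on HH:M in the time because that gives more of a chance of catching the entire flow, and because the chances of other flows with the same PID falling in the same timeframe are pretty slim. These results I will later send to an internal web page.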

    # -------------------------------------------------------
    # find flow based on pid found in criteria
    # -------------------------------------------------------

    foreach $line(@messages){
        if(my($pid) = $line =~ m{ \[ \s*(\d+) \]: }x) {
            if($line =~ /(\d{2})\:(\d)/){
                $time = $1 . ":" . $2;
                if ($pids{$pid}[0] =~ /$time/){
                    push $pids{$pid}[0], $line;
                }
            }
        }
    }

Right now the above code is, for some reason, deleting the time from the hash once it is matched. I'm unsure why this is happening.

I was able to get this working in a bash script, but it took decades to complete. On the suggestions of people here, I have decided to tackle it in Perl and am taking a crash course. I've read all I can and have basic programming skills in C++, but I still need a lot of work. I got it working using arrays, but once again it was incredibly slow, and I was getting a lot of flows that matched the process ID but were not the flows I was looking for. After further suggestions I decided to work with hashes: have the process ID as the key, have a specific time referenced by the key, and then have the lines within the log that match both the key and the time as the flow. I have had multiple questions on this, as I have (a) not explained myself well and (b) been trying different things to learn. For the record, everyone here has helped me tremendously, and I hope one day I can do the same for others on the list. For some reason I can't get this stuff through my thick skull.

Anyway, I think I've covered everything. I'm sure I'm starting to get on people's nerves with these questions, so I apologize.

Update:

Well, I think I figured out how to make these hashes, but it doesn't look right. I changed push @{$pids{$1}}, $hhmm; to $pids{$1}{$x} = $hhmm; which creates the following:

    $var1 = {
              '743' => {
                         '' => '00:1'
                       },
              '20687' => {
                           '' => '01:3'
                         },

But it doesn't look like it's referencing correctly, because when I print $pids{743}; it prints HASH(0x4caf10).

Update:

OK, I was able to put the values into the hash by changing push @{$pids{$1}}, $hhmm; to $pids{$1} = $hhmm; which seems to be working:

    $var1 = {
              '743' => '00:1',
              '20687' => '01:3',
    };

But how do I check to see whether the value '00:1' matches a variable? This is what I have, and it is not working:

    if($pids{$pid} == qr/$time/){
        $pids{$pid}{$time}[$y] = $line;
        $y++;
    };

This is how it should look after a match is made:

    $var1 = {
              '743' => '00:1',
              '4771' => {
                          '23:5' => [
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: user test ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: pass password ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: ftp login 192.168.0.2 [192.168.0.2], test ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: cwd /home/test/ ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: type image ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: pasv ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: retr test ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: quit ',
                                      'dec  1 23:59:23 ftp1 ftpd[4771]: ftp session closed '
                                    ]
                        },

You have a couple of errors in your code.

The first is that you're pulling out only one digit of the minutes:

    if($line =~ /(\d{2})\:(\d)/ ) { 

should be

    if($line =~ /(\d{2})\:(\d{2})/ ) { 
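As a minimal sketch of the difference, assuming log lines shaped like the samples in the question (the variable names `$short` and `$full` are made up for illustration), the two patterns capture the following. If the ten-minute bucket described in the question is intentional, the one-digit form may still be what you want.

```perl
use strict;
use warnings;

# Sample line copied from the question's log excerpt.
my $line = 'dec  2 00:14:09 ftp1 ftpd[743]: user testing1';

# One digit of minutes (the original pattern) versus two digits.
my ($short) = $line =~ /(\d{2}:\d)/;
my ($full)  = $line =~ /(\d{2}:\d{2})/;

print "short: $short\n";   # short: 00:1
print "full:  $full\n";    # full:  00:14
```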

If I'm interpreting the intent of your code correctly, you're trying to find out whether you're seeing the time for a given PID for the first time. If so, you don't need to loop through the keys in %pid to do this. All you need is

        if ($line =~ /ftpd\[(.*?)\]/) {
            $pid{$1}[0] = $hhmm unless exists $pid{$1};
        }
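A small self-contained sketch of that first-seen check follows. The list of (pid, time) pairs in `@seen` is hypothetical and only stands in for values extracted from the log.

```perl
use strict;
use warnings;

my %pid;

# Hypothetical (pid, time) pairs in the order they might appear in the log.
my @seen = ( [743, '00:14'], [1261, '00:30'], [743, '00:15'] );

for my $pair (@seen) {
    my ($id, $hhmm) = @$pair;
    # Store the time only the first time this pid is seen.
    $pid{$id}[0] = $hhmm unless exists $pid{$id};
}

print "$_ => $pid{$_}[0]\n" for sort { $a <=> $b } keys %pid;
```

The second entry for pid 743 is ignored, because `exists` is a single hash lookup and no loop over the keys is needed.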

Notice that I'm doing an assignment rather than a push, so you'll wind up with the time in the first element of the array reference.

I think you may have meant to type "==" instead of "=~" below:

            if(grep $1 =~ $var, keys %pids){ 
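To illustrate why the operator matters, here is a hedged sketch with made-up PIDs: a regex match treats the stored key as a pattern, so a PID that merely contains another PID counts as a hit, while numeric equality does not.

```perl
use strict;
use warnings;

my @known = (743, 20687);   # hypothetical pids already stored as keys
my $new   = 7431;           # hypothetical pid from the current line

# Regex match: "7431" contains "743", so this reports a false hit.
my $regex_hit = grep { $new =~ /$_/ } @known;

# Numeric equality: only an exact pid counts.
my $exact_hit = grep { $new == $_ } @known;

print "regex: $regex_hit, exact: $exact_hit\n";   # regex: 1, exact: 0
```

For keys that are not purely numeric, `eq` would be the string counterpart of `==`.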

Presumably you'll need to capture more information than the time, such as the user name, transfer type, etc. You may find it better to use a hash reference instead of an array reference under the PID. That way you can tag and find the information:

        if ($line =~ /ftpd\[(.*?)\]/) {
            # capture the pid only after the match has succeeded
            $pid = $1;
            $pid{$pid}{time} = $hhmm unless exists $pid{$pid};
        }
        if ($line =~ /user (\w+)/) {
            $pid{$pid}{user} = $1;
        }
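As a runnable sketch of this hash-of-hashes layout, assuming the log format shown in the question (the combined regex and the names `$id` and `$rest` are my own choices, not something from the original script):

```perl
use strict;
use warnings;

# Lines copied from the question's sample log.
my @lines = (
    'dec  2 00:14:09 ftp1 ftpd[743]: user testing1',
    'dec  2 00:14:09 ftp1 ftpd[743]: ftp login 192.168.0.2 [192.168.0.2], testing1',
    'dec  2 00:30:08 ftp1 ftpd[1261]: user testing1',
);

my %pid;
for my $line (@lines) {
    # Pull out HH:MM, the pid, and the rest of the message in one match.
    next unless $line =~ /(\d{2}:\d{2}):\d{2}\s+\S+\s+ftpd\[(\d+)\]:\s*(.*)/;
    my ($hhmm, $id, $rest) = ($1, $2, $3);

    # Tag each piece of information under the pid by name.
    $pid{$id}{time} = $hhmm unless exists $pid{$id}{time};
    $pid{$id}{user} = $1 if $rest =~ /^user (\w+)/;
}

for my $id (sort { $a <=> $b } keys %pid) {
    print "$id: time=$pid{$id}{time} user=$pid{$id}{user}\n";
}
```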

Of course, you'll want to index according to whatever makes sense for your purposes, to make your searches fast. For instance, you might keep a second hash indexed by time:

           $time{$hhmm}{pid} = $pid; 

Or keep a list of the PIDs associated with a given user:

           push @{$user{$1}}, $pid; 
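Putting these ideas together, one possible way to arrive at the pid => time => lines structure shown at the end of the question is sketched below. The `%flow` and `%user` names, the combined regex, and the extra line for pid 9999 are assumptions for illustration only.

```perl
use strict;
use warnings;

# Sample lines from the question, plus one hypothetical unrelated pid.
my @lines = (
    'dec  1 23:59:03 ftp1 ftpd[4152]: user testing1',
    'dec  1 23:59:03 ftp1 ftpd[4152]: pass password',
    'dec  1 23:59:03 ftp1 ftpd[4152]: pwd',
    'dec  1 23:59:04 ftp1 ftpd[9999]: user other',
);

my (%flow, %user);
for my $line (@lines) {
    next unless $line =~ /(\d{2}:\d{2}):\d{2}\s+\S+\s+ftpd\[(\d+)\]:\s*(.*)/;
    my ($hhmm, $id, $rest) = ($1, $2, $3);

    # pid => time => [ lines ], the shape the question asks for.
    push @{ $flow{$id}{$hhmm} }, $line;

    # user name => [ pids ], so a search by user is a single lookup.
    push @{ $user{$1} }, $id if $rest =~ /^user (\w+)/;
}

# Print every line of every flow started by one user.
for my $id (@{ $user{testing1} || [] }) {
    for my $hhmm (sort keys %{ $flow{$id} }) {
        print "$_\n" for @{ $flow{$id}{$hhmm} };
    }
}
```

Keying on the full HH:MM splits a flow that crosses a minute boundary into several buckets, so depending on the data, the PID alone or the coarser HH:M key from the question may be a better fit.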
