arrays - Playing with Hashes from an FTP flow in Perl
OK, I'm having trouble understanding how hashes work. Long story short, I'm attempting to parse through an FTP log and find the relevant flows for specific search criteria. What I'm trying to do is this: I have an IP address or a user name, and first a pretty simple grep tries to minimize the data I don't need and sends the output to an external file. If I'm searching for the username testing1, it greps on testing1 and sends the output to a file called output.txt:
    dec 2 00:14:09 ftp1 ftpd[743]: user testing1
    dec 2 00:14:09 ftp1 ftpd[743]: ftp login 192.168.0.2 [192.168.0.2], testing1
    dec 2 00:30:08 ftp1 ftpd[1261]: user testing1
    dec 2 00:30:09 ftp1 ftpd[1261]: ftp login 192.168.0.4 [192.168.0.4], testing1
    dec 2 01:12:33 ftp1 ftpd[11804]: user testing1
    dec 2 01:12:33 ftp1 ftpd[11804]: ftp login 192.168.0.2 [192.168.0.2], testing1
And below is an example of the originating log data:
    dec 1 23:59:03 ftp1 ftpd[4152]: user testing1
    dec 1 23:59:03 ftp1 ftpd[4152]: pass password
    dec 1 23:59:03 ftp1 ftpd[4152]: ftp login 192.168.0.02 [192.168.0.2], testing1
    dec 1 23:59:03 ftp1 ftpd[4152]: pwd
    dec 1 23:59:03 ftp1 ftpd[4152]: cwd /test/data/
    dec 1 23:59:03 ftp1 ftpd[4152]: type image
I then go in and put the process IDs I find, along with the time of each ID, into a hash. See below:
    $var1 = {
        '743' => [ '00:1' ],
        '20687' => [ '01:3' ],
        '27186' => [ '15:3' ],
        '6929' => [ '12:0' ],
        '24771' => [ '09:0' ],
        '11804' => [ '01:1' ],
        '27683' => [ '08:3' ],
        '14976' => [ '04:3' ],
    };
It looks as if the time is being put into the hash as an array. I was unable to figure out why this is happening, so I decided to work with the array. The following is how the hash of arrays is created:
    # -------------------------------------------------------
    # extract pids and times from the lines, take out doubles
    # -------------------------------------------------------
    $infile3 = 'output.txt';
    %pids;
    $found;
    $var;
    open (input2, $infile3) or die "couldn't read $infile3.\n";
    while (my $line = <input2>) {
        if($line =~ /(\d{2})\:(\d)/ ) {
            $hhmm = $1 . ":" . $2;
            if ($line =~ /ftpd\[(.*?)\]/) {
                $found = 0;
                foreach $var(keys %pids){
                    if(grep $1 =~ $var, keys %pids){
                        $found = 1;
                    }
                }
                if ($found == 0){
                    push @{$pids{$1}}, $hhmm;
                }
            }
        }
    }
To speed things up, I have decided to read the lines that have matching PIDs, whether they fit the flow or not, into an array so I don't have to keep reading the originating file.
    ##-------------------------------------------------------
    ## read each matching line from the file into an array
    ##-------------------------------------------------------
    open (input, $infile2) or die "couldn't read $infile2.\n";
    @messages;
    while (my $line = <input>){
        # if there is a match on the pid, put the line in the array
        if ($line =~ /ftpd\[(.*?)\]/){
            $mpid = $1;
            foreach $key (keys %pids){
                if ($key =~ $mpid){
                    push @messages, $line;
                }
            }
        }
    }
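Now I'm trying to match each line with the PID and time of the flow. I'm matching only hh:m of the time so there is more of a chance of getting the entire flow, and because the chances of other flows with the same PID having the same timeframe are pretty slim. These results will be sent to an internal web page.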
    # -------------------------------------------------------
    # find the flow based on the pids found from the criteria
    # -------------------------------------------------------
    foreach $line(@messages){
        if(my($pid) = $line =~ m{ \[ \s*(\d+) \]: }x) {
            if($line =~ /(\d{2})\:(\d)/){
                $time = $1 . ":" . $2;
                if ($pids{$pid}[0] =~ /$time/){
                    push $pids{$pid}[0], $line;
                }
            }
        }
    }
Right now the above code is, for some reason, deleting the time from the hash once it is matched. I'm unsure why this is happening.
I was able to get this working with a bash script, but it took decades to complete. On suggestions from people here, I decided to tackle it in Perl, taking a crash course as I go. I've read all I can and have basic programming skills in C++, but I still need a lot of work. I got it working using arrays, but once again it was incredibly slow, and I was getting a lot of flows that matched the process ID but were not the flows I was looking for. After further suggestions I decided to work with hashes: have the process ID as the key, have a specific time referenced by the key, and then have the lines within the log that match both the key and the time as the flow. I have had multiple questions on this because I have (a) not explained myself well and (b) been trying different things to learn. For the record, everyone here has helped me tremendously, and I hope one day I can do the same for others on this list. For some reason I just can't get this stuff through my thick skull.
Anyway, I think that covers everything. I'm sure I'm starting to get on people's nerves with these questions, and I apologize.
Update:
Well, I think I figured out how to make the hashes, but it doesn't look right. I changed
    push @{$pids{$1}}, $hhmm;
to
    $pids{$1}{$x} = $hhmm;
which creates the following:
    $var1 = {
        '743' => { '' => '00:1' },
        '20687' => { '' => '01:3' },
But it doesn't look like it's referencing correctly. When I
    print $pids{743};
it prints
    HASH(0x4caf10)
Update:
OK, I was able to put the values into the hash by changing
    push @{$pids{$1}}, $hhmm;
to
    $pids{$1} = $hhmm;
This seems to be working:
    $var1 = {
        '743' => '00:1',
        '20687' => '01:3',
    };
But now how do I check whether the value '00:1' matches a variable? I have this, and it is not working:
    if($pids{$pid} == qr/$time/){
        $pids{$pid}{$time}[$y] = $line;
        $y++;
    };
This is how it should look after a match is made:
    $var1 = {
        '743' => '00:1',
        '4771' => {
            '23:5' => [
                'dec 1 23:59:23 ftp1 ftpd[4771]: user test ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: pass password ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: ftp login 192.168.0.2 [192.168.0.2], test ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: cwd /home/test/ ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: type image ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: pasv ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: retr test ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: quit ',
                'dec 1 23:59:23 ftp1 ftpd[4771]: ftp session closed '
            ]
        },
You have a couple of errors in your code.
The first is that you're pulling out only one digit of the minutes:
    if($line =~ /(\d{2})\:(\d)/ ) {
should be
    if($line =~ /(\d{2})\:(\d{2})/ ) {
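As a quick check of that pattern, here is a minimal sketch that pulls the hour and minute out of one of the sample log lines from the question. The variable names and the hard-coded sample line are only for illustration.

```perl
use strict;
use warnings;

# Sample line taken from the log excerpt in the question.
my $line = 'dec 2 00:14:09 ftp1 ftpd[743]: user testing1';

my $hhmm;
# \d{2}:\d{2} captures both minute digits; the first match in the
# line is the hh:mm part of the hh:mm:ss timestamp.
if ($line =~ /(\d{2}):(\d{2})/) {
    $hhmm = "$1:$2";
}

print "$hhmm\n";    # prints 00:14
```

If you really do want ten-minute buckets, keeping the single `\d` is a legitimate choice, but then the stored value is "00:1" rather than a full time.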
If I'm interpreting the intent of your code correctly, you're trying to find out whether this is the first time you've seen a time for a given PID. If so, you don't need to loop through the keys in %pid to do this. All you need is
    if ($line =~ /ftpd\[(.*?)\]/) {
        $pid{$1}[0] = $hhmm unless exists $pid{$1};
    }
Notice that I'm doing an assignment rather than a push, so you wind up with the time in the first element of the array reference.
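To see the first-seen behaviour in isolation, here is a small sketch under the assumption that two lines arrive from the same ftpd process. The second sample line is made up for the example.

```perl
use strict;
use warnings;

my %pid;

# Hypothetical sample: two lines from the same ftpd process.
my @lines = (
    'dec 2 00:14:09 ftp1 ftpd[743]: user testing1',
    'dec 2 00:15:42 ftp1 ftpd[743]: pwd',
);

for my $line (@lines) {
    next unless $line =~ /(\d{2}):(\d{2})/;
    my $hhmm = "$1:$2";
    if ($line =~ /ftpd\[(\d+)\]/) {
        # Only the first line seen for a pid sets the time;
        # exists() is a direct lookup, so no loop over the keys.
        $pid{$1}[0] = $hhmm unless exists $pid{$1};
    }
}

print "$pid{743}[0]\n";    # prints 00:14
```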
I think you may have meant to type "==" instead of "=~" below:
    if(grep $1 =~ $var, keys %pids){
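The difference matters because a regex match succeeds on substrings. The sketch below uses a made-up second PID, 17431, to show how "=~" can report a PID as already seen when it is not. An exists lookup is included as a third option, on the assumption that all you need is a yes/no answer.

```perl
use strict;
use warnings;

my %pids = (743 => '00:1');   # one pid already recorded
my $new  = '17431';           # a different pid that contains "743"

# Regex match: "17431" =~ /743/ succeeds because 743 is a substring.
my $regex_hit = grep { $new =~ $_ } keys %pids;

# Numeric comparison only succeeds for the same pid.
my $equal_hit = grep { $new == $_ } keys %pids;

# A direct hash lookup does the same job without scanning the keys.
my $exists_hit = exists $pids{$new} ? 1 : 0;

print "regex: $regex_hit, ==: $equal_hit, exists: $exists_hit\n";
# prints regex: 1, ==: 0, exists: 0
```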
Presumably you'll need to capture more information than the time, such as the user name, transfer type, etc. You may find it better to use a hash reference instead of an array reference under the PID. That way you can tag and find the information:
    if ($line =~ /ftpd\[(.*?)\]/) {
        $pid = $1;    # save the pid right after the match that sets $1
        $pid{$pid}{time} = $hhmm unless exists $pid{$pid};
    }
    if ($line =~ /user (\w+)/) {
        $pid{$pid}{user} = $1;
    }
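Here is a self-contained sketch of that layout run over two of the sample lines from the question. The field names time and user follow the snippet above; the lexical $id is my own name, chosen to avoid confusion with the %pid hash. It also shows why printing $pids{743} gave you HASH(0x4caf10): the value under the PID is a reference, so you have to name the field you want.

```perl
use strict;
use warnings;

my %pid;
my @lines = (
    'dec 2 00:14:09 ftp1 ftpd[743]: user testing1',
    'dec 2 00:14:09 ftp1 ftpd[743]: ftp login 192.168.0.2 [192.168.0.2], testing1',
);

for my $line (@lines) {
    next unless $line =~ /(\d{2}):(\d{2})/;
    my $hhmm = "$1:$2";
    next unless $line =~ /ftpd\[(\d+)\]/;
    my $id = $1;    # save the pid before another match overwrites $1

    $pid{$id}{time} = $hhmm unless exists $pid{$id};
    if ($line =~ /\]: user (\w+)/) {
        $pid{$id}{user} = $1;
    }
}

# $pid{743} is a hash reference, so printing it directly would show
# something like HASH(0x...); name the field you want instead.
print "$pid{743}{time} $pid{743}{user}\n";    # prints 00:14 testing1
```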
Of course, you'll want to index according to whatever makes sense for your purposes, to make your searches fast. For instance, you might keep a second hash indexed by time:
    $time{$hhmm}{pid} = $pid;
Or keep a list of PIDs associated with a given user:
    push @{$user{$1}}, $pid;
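Putting the pieces together, the following is one possible sketch, not a drop-in replacement for your script. It assumes the log format shown in the question, stores each flow's lines under its PID, and keeps a per-user list of PIDs. The lines field and the single combined regex are my own choices for the example. Note that the stored time is compared with eq, since it is a plain string and not a number or a pattern.

```perl
use strict;
use warnings;

my (%pid, %user);

# Sample lines in the format shown in the question.
my @log = (
    'dec 1 23:59:23 ftp1 ftpd[4771]: user test',
    'dec 1 23:59:23 ftp1 ftpd[4771]: pass password',
    'dec 1 23:59:24 ftp1 ftpd[4771]: quit',
    'dec 2 00:14:09 ftp1 ftpd[743]: user testing1',
);

for my $line (@log) {
    next unless $line =~ /(\d{2}):(\d{2}):\d{2} \S+ ftpd\[(\d+)\]: (.*)/;
    my ($hhmm, $id, $cmd) = ("$1:$2", $3, $4);

    # First line for this pid: record the time and start the flow.
    if (!exists $pid{$id}) {
        $pid{$id} = { time => $hhmm, lines => [] };
    }
    # Index the pid under the user name when the login line appears.
    if ($cmd =~ /^user (\w+)/) {
        push @{ $user{$1} }, $id;
    }

    # String comparison with eq, then collect the line under its pid.
    push @{ $pid{$id}{lines} }, $line if $pid{$id}{time} eq $hhmm;
}

# Print every line of every flow that belongs to user "test".
for my $id (@{ $user{test} || [] }) {
    print "$_\n" for @{ $pid{$id}{lines} };
}
```

One caveat with this design: comparing the full hh:mm drops lines from a session that runs past the minute in which it started, so you may want a looser test (or none at all) depending on how often PIDs are reused in your logs.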