Hi,
You can do it like this:
irb(main):001:0> a=File.open("a","r")
=> #<File:a>
irb(main):002:0> arr=a.readlines
=> ["1\n", "2\n", "3\n", "4\n", "45\n", "1\n", "23\n", "4\n", "56\n", "3\n",
"12\n", "45\n", "6\n", "34\n"]
This reads the lines from the file and stores them in an array.
irb(main):003:0> arr.length
=> 14
irb(main):004:0> arr.uniq
=> ["1\n", "2\n", "3\n", "4\n", "45\n", "23\n", "56\n", "12\n", "6\n",
"34\n"]
irb(main):005:0> arr.uniq.length
=> 10
So the unique values can be obtained using only the array (Array#uniq).
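For a large log file, something along these lines might help, since it
reads one line at a time instead of loading everything with readlines.
This is only a rough sketch: unique_hosts is a made-up helper name, and it
assumes the host is the first whitespace-separated field on each line (as
it is when the log format starts with %h). It does not use apachelogregex.

```ruby
require 'set'

# Hypothetical helper, not from the original script: returns a Set of the
# distinct hosts in an access log. Assumes the host is the first
# whitespace-separated field on every line.
def unique_hosts(path)
  hosts = Set.new
  # File.foreach yields one line at a time, so the whole file is never
  # held in memory at once.
  File.foreach(path) do |line|
    host = line[/\S+/]      # first run of non-whitespace characters, or nil
    hosts << host if host   # a Set ignores values it already contains
  end
  hosts
end

# Example use (file name taken from the script quoted below):
#   puts unique_hosts('aleurier_access.log').to_a.sort
```

The Set plays the same role as uniq here: adding a host that is already
present has no effect, so each address is kept once.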
On Tue, Aug 24, 2010 at 5:48 PM, Ranjith <ranjith.k@xxxxxxxxxxx> wrote:
Hi all,
I have a script that reads an Apache server log file and returns a set
of values. Here is the script:
require 'rubygems'
require 'apachelogregex'
require 'set'
require 'pp'
urls = Set.new
format = '%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
parser = ApacheLogRegex.new(format)
File.readlines('aleurier_access.log').collect do |line|
@line = parser.parse(line)
urls.add(@line["%h"])
# {"%r"=>"GET /blog/index.xml HTTP/1.1", "%h"=>"87.18.183.252", ... }
end
puts urls
pp @line
For example, when I run this script it prints every host address. What I
would like to do is print each repeated IP address only once, along with
the IP addresses that are not repeated.
I tried uniq, but it only works on arrays, not while reading through a
file. I also tried the version below; it works for a small file but is not
suitable for larger files.
require 'rubygems'
require 'apachelogregex'
require 'set'
require 'pp'
urls = Set.new
format = '%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
parser = ApacheLogRegex.new(format)
l = nil # declared outside the block so it is still visible for pp below
File.readlines('aleurier_access.log').each do |line|
l = parser.parse(line)
urls.add(l["%h"])
# {"%r"=>"GET /blog/index.xml HTTP/1.1", "%h"=>"87.18.183.252", ... }
end
puts urls.to_a.sort
pp l
--
Cheers,
Ranjith Kumar.K,
Software Engineer,
Sedin Technologies,
http://ranjithtenz.wordpress.com/
http://victusads.com/