[pythonvis] Re: Part 2: How to split line into words and count them

  • From: "Jeffrey Turner" <jturner522@xxxxxxxxx>
  • To: <pythonvis@xxxxxxxxxxxxx>
  • Date: Thu, 22 May 2014 13:39:20 -0400

Robert,

Use the link in the footer of this message to go to the web page for the list. 
That has a link to the list archives, which will have what you're looking for.

JDog


-----Original Message-----
From: pythonvis-bounce@xxxxxxxxxxxxx [mailto:pythonvis-bounce@xxxxxxxxxxxxx] On 
Behalf Of Robert Spangler
Sent: Thursday, May 22, 2014 12:23 PM
To: pythonvis@xxxxxxxxxxxxx
Subject: [pythonvis] Re: Part 2: How to split line into words and count them

Hello,

I accidentally deleted part 1.  Where may I find it?

Robert

On 5/22/2014 12:03 PM, Richard Dinger wrote:
> In part 1 the file was opened and read line by line.  Note there are
> print statements that can be uncommented to trace what is happening.
> Once the file is opened each line is processed.  The string object
> method split is used to split the line into a list of words at each
> whitespace location.
> wordMap is a data structure called a dictionary.  A dictionary is sort
> of a list that is accessed by a key (such as a word in this example)
> rather than by an index.  So if our text file has the word 'the' in it 5
> times:
> wordMap['the']
> would give 5.  So I use a dictionary with the words of the text file as
> the keys to count how many of each word there are.
> Another for loop processes each word in the list.  Words not already
> in the wordMap are added with a count of 1 and existing members are
> incremented.  The get method tries to get the count of its first
> argument and if it is not in the wordMap the second argument is
> returned.  So the statement:
> wordMap[word] = wordMap.get(word, 0) + 1
> has the same result as:
> if word not in wordMap:
>    wordMap[word] = 0
> wordMap[word] = wordMap[word] + 1
> At the end of the file the result is printed out in no particular order.
> Note the if __name__ stuff at the end of the file is True when the
> script is run directly and False when imported into another script.  So
> including a section like this is a good place to put some testing code.
> So make up a file with some text in it and either name it words.txt or
> change the code to match the name.  Then run this thing.
> Now this still needs some work, since capitalized words are counted
> separately from lowercase ones and punctuation attached to a word
> changes the counts.  But we will look at that in the next version.
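
For anyone without part 1 at hand, here is a minimal sketch of the steps described above. It is not the original script: it assumes Python 3, and the function names count_words and count_words_in_file are made up for illustration. Only wordMap, split, get, and the if __name__ test come from the description.

```python
def count_words(lines):
    """Return a dictionary mapping each whitespace-separated word to its count."""
    wordMap = {}
    for line in lines:
        # print(repr(line))  # uncomment to trace each line as it is read
        for word in line.split():  # split at each run of whitespace
            # get returns 0 when the word is not yet a key
            wordMap[word] = wordMap.get(word, 0) + 1
    return wordMap


def count_words_in_file(filename):
    """Hypothetical helper: open a text file and count its words line by line."""
    with open(filename) as f:
        return count_words(f)


if __name__ == "__main__":
    # Runs only when the script is executed directly, not when imported.
    sample = ["the cat saw the dog", "The dog ran."]
    for word, count in count_words(sample).items():
        print(word, count)
```

To try it on a real file, call count_words_in_file("words.txt") after creating that file. As noted above, 'the' and 'The' are counted separately and 'ran.' keeps its period, since nothing here normalizes case or strips punctuation.
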
List web page is 
//www.freelists.org/webpage/pythonvis

To unsubscribe, send email to 
pythonvis-request@xxxxxxxxxxxxx with "unsubscribe" in the Subject field.

