Pythonic String Testing
Solution 1:
You can start by simplifying content_test():
def content_test(term):
    return any(c.isalpha() for c in term)
In fact, that's simple enough that you don't really need a separate function for it anymore.
What I'd do in this case is write a generator that yields only valid terms from the file. Then just convert that to a list using the list() constructor. This way you can read just a line at a time, which will save you a good bit of memory if the files are large.
def read_valid_terms(filename):
    with open(filename) as f:
        for line in f:
            for term in line.split():
                if any(c.isalpha() for c in term):
                    yield term
terms = list(read_valid_terms("terms.txt"))
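The snippets in this answer use Python 2 syntax. As a minimal Python 3 sketch of the same generator, here it is run against a throwaway file; the sample text and the temporary file are made up purely for illustration:

```python
import os
import tempfile


def read_valid_terms(filename):
    """Yield each whitespace-separated term that contains at least one letter."""
    with open(filename) as f:
        for line in f:  # the file is consumed one line at a time
            for term in line.split():
                if any(c.isalpha() for c in term):
                    yield term


# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("abc 123 a1b2\n--- x9 42\n")
    sample_path = tmp.name

try:
    terms = list(read_valid_terms(sample_path))
finally:
    os.remove(sample_path)

print(terms)  # ['abc', 'a1b2', 'x9']
```

Terms made up only of digits or punctuation are skipped, and the file is closed as soon as the generator is exhausted.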
Or if you are just going to iterate over the terms anyway, and only once, then just do that directly rather than making a list:
for term in read_valid_terms("terms.txt"):
    print term,
print
Solution 2:
In Python, string objects already have an isalpha() method, which tells you whether every character in the string is alphabetic:
>>> "abc".isalpha()
True
>>> "abc22".isalpha()
False
Solution 3:
While you could use a regular expression, a pythonic way would be to use any:
import string
def content_test(term):
    return any(c in string.ascii_lowercase for c in term)
If you also want to allow upper-case and locale-dependent characters, you can use str.isalpha.
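To illustrate the difference, here is a small Python 3 sketch comparing the two checks. The helper names has_ascii_lower and has_alpha are made up for this example. Note that in Python 3, str.isalpha works on Unicode characters rather than on the current locale:

```python
import string


def has_ascii_lower(term):
    # True if at least one character is an ASCII lower-case letter (a-z).
    return any(c in string.ascii_lowercase for c in term)


def has_alpha(term):
    # True if at least one character is alphabetic according to str.isalpha.
    return any(c.isalpha() for c in term)


# Print both results side by side for a few sample terms.
for term in ["abc", "ABC", "123", "é1"]:
    print(term, has_ascii_lower(term), has_alpha(term))
```

Only the isalpha-based check accepts "ABC" and "é1"; both checks reject "123".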
A couple of additional notes:
- FileRead should inherit from object, to make sure it's a new-style class.
- Instead of writing if content_test(term) is False:, you can simply write if not content_test(term):.
- clean can be written a lot, ahem, cleaner, by using filter:
def clean(self):
    self.terms = filter(content_test, self.terms)
- You're not closing the file f, and may therefore leak the handle. Use the with statement to automatically close it, like this:
with open(filename, 'r') as f:
    content = f.read()
    self.terms = content.split()
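Putting these notes together, a minimal Python 3 sketch might look like the following. The layout of FileRead (its constructor and the terms attribute) is an assumption, since the original class is not shown here. In Python 3 every class is already new-style, and filter returns an iterator, so the result is wrapped in list():

```python
import os
import tempfile


def content_test(term):
    # True if the term contains at least one alphabetic character.
    return any(c.isalpha() for c in term)


class FileRead(object):  # the explicit object base only matters on Python 2
    def __init__(self, filename):
        # Assumed constructor: read the whole file and split it into terms.
        with open(filename, 'r') as f:  # closed automatically on exit
            content = f.read()
        self.terms = content.split()

    def clean(self):
        # Keep only the terms that pass content_test.
        self.terms = list(filter(content_test, self.terms))


# Hypothetical sample file for the demo.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("foo 42 b4r\n!!! baz\n")
    path = tmp.name

try:
    reader = FileRead(path)
    reader.clean()
finally:
    os.remove(path)

print(reader.terms)  # ['foo', 'b4r', 'baz']
```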
Solution 4:
Using regular expressions:
import re
# Match any number of non-whitespace characters, with an alpha char in it.
terms = re.findall(r'\S*[a-zA-Z]\S*', content)
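As a quick check of what this pattern picks up, here is a small Python 3 sketch run on made-up sample text. The pattern is written as a raw string so that \S reaches the regex engine unchanged, and the character class [a-zA-Z] only recognises ASCII letters:

```python
import re

# Hypothetical sample text for the demo.
content = "abc 123 a1b2 --- x9\n42 don't"

# Each match is a run of non-whitespace characters containing an ASCII letter.
terms = re.findall(r'\S*[a-zA-Z]\S*', content)

print(terms)  # ['abc', 'a1b2', 'x9', "don't"]
```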