2 questions: Assn. #3
greenspun.com : LUSENET : Brandeis CS114
A couple more questions about Assignment 3:
1) The example output is of the form [['NX', ['The/DT', 'green/JJ', 'dog/NN']], ['VX', ['was/VBD', 'eating/VBG']],.........etc.
Wouldn't it be much easier (to program as well as to manipulate) if the data were in the form [['NX', [['The', 'DT'], ['green', 'JJ'], ['dog', 'NN']]], ['VX', [['was', 'VBD'], ['eating', 'VBG']]],........etc.
Maybe I'm wrong, but it seems that taking the data out of its list format significantly reduces its usefulness (even though it is easier to read).
2) I'm a little confused about Part 3. Is the named entity parsing meant to be a separate function from the regular parser, like the ambiguity_ratio was an additional function for the tagger? Or are we supposed to automatically get a listing of the named entities following (or interspersed in) our parsed text when we call parser.parse?
Thanks.
-- Anonymous, March 29, 1999
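On the first question above, the two output formats carry the same information, so one can be converted to the other with a few lines. The following is only a sketch: the function names (split_token, to_pairs, to_slashed) are made up for illustration and are not part of the assignment spec, and it assumes every token contains at least one slash, with the tag after the last one.

```python
def split_token(token):
    """Split 'dog/NN' into ['dog', 'NN'].

    Splits on the LAST slash only, so a word that itself contains
    a slash (e.g. '1/2/CD') keeps its slash in the word part.
    """
    word, tag = token.rsplit("/", 1)
    return [word, tag]


def to_pairs(chunks):
    """Convert [label, ['word/TAG', ...]] chunks to [label, [[word, tag], ...]]."""
    return [[label, [split_token(t) for t in tokens]] for label, tokens in chunks]


def to_slashed(chunks):
    """Convert [label, [[word, tag], ...]] chunks back to the 'word/TAG' form."""
    return [[label, ["/".join(pair) for pair in pairs]] for label, pairs in chunks]


# The example from the question, in the slashed format.
chunks = [["NX", ["The/DT", "green/JJ", "dog/NN"]],
          ["VX", ["was/VBD", "eating/VBG"]]]

pairs = to_pairs(chunks)
print(pairs)
print(to_slashed(pairs) == chunks)
```

With something like this, a parser could work on the pair format internally and convert to the slashed format only when printing.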
In answer to your second question, I'm assuming the name parser is a separate module altogether. Judging by the specs, it looks rather like it's not meant to be used on the same text (it's specifically for Yahoo news articles).
HTH, Vivek
-- Anonymous, March 29, 1999
1.) I asked the same question about the weird formatting (I think), but I asked about the .. it's just different syntax which equates to the same thing, correct?
2.) My understanding is that the 3rd part looks at the NNPs, compares them to a lexicon, and destructively replaces the 'NNP' tag with the appropriate type (TIME, PERSON, PLACE, etc.). There are a few mis-tags in the tagged text, like the "gen." tag I mentioned previously, but overall it's not too bad.
-- Anonymous, March 29, 1999
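The lexicon lookup described in the post above could be sketched roughly as follows. This is one reading of Part 3, not the official spec: the lexicon entries, the function name retag_named_entities, and the choice to store each token as a [word, tag] list are all assumptions made for illustration.

```python
# Hypothetical lexicon; a real one would be built from the assigned text.
LEXICON = {
    "Monday": "TIME",
    "Clinton": "PERSON",
    "Boston": "PLACE",
}


def retag_named_entities(chunks, lexicon):
    """Replace 'NNP' tags in place with the entity type found in the lexicon.

    chunks is assumed to look like [label, [[word, tag], ...]].
    Words tagged NNP that are missing from the lexicon keep the NNP tag.
    """
    for _label, pairs in chunks:
        for pair in pairs:
            word, tag = pair
            if tag == "NNP" and word in lexicon:
                pair[1] = lexicon[word]  # destructive update of the tag
    return chunks


parsed = [["NX", [["Clinton", "NNP"]]],
          ["VX", [["visited", "VBD"]]],
          ["NX", [["Boston", "NNP"]]],
          ["NX", [["Vivek", "NNP"]]]]

retag_named_entities(parsed, LEXICON)
print(parsed)
```

Because the update is destructive, the original NNP tags are gone afterwards; copying the chunk list first (for example with copy.deepcopy) would keep them if they are still needed.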
Sorry, I forgot something:
The lexicon used in #3 should be derived specifically to suit the text we are given, not _all_ newswire text! I don't see any other way to determine whether an NNP is a person, time, or place via a Python function :)
Andrew
-- Anonymous, March 29, 1999
Yeah, Vivek is right. So don't follow my advice, create your own lexicon.
I haven't gotten a clear answer yet, tho: Is #3 supposed to work on this article specifically or on all newswire text in general?
Andrew
-- Anonymous, March 29, 1999