Parsing unstructured location data (Twitter)

Hi folks,

I'm looking at data from the Twitter API. One of the data points I am interested in is location (country). Very few tweets are actually geo-tagged, but users have the option of entering a "location" in their user profile. This is a free-form field, so the formatting is very inconsistent (sometimes a country, sometimes a city, sometimes a US state, sometimes "in my parents' basement"). Excerpt below.
Question: Does anyone have suggestions on how to parse this type of data in order to get the COUNTRY for as many users as possible? I can probably cobble something together with regular expressions, but does anyone know of a library or API that does this efficiently? Any ideas would be appreciated!

1 Asia Pacific
2 Australia
3 NA
4 NA
5 Europe
6 Austin, TX
7 Bali, Indonesia
8 NA
9 New York, NY
10 Königstetten
11 Austin, TX
12 New York, New York
13 London
14 Nairobi
15 San Diego, CA
16 Mexico
17 Bucharest
18 NA
19 Southern California
20 India
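
To make the question concrete, here is a minimal sketch of the kind of lookup approach I could cobble together myself (Python, standard library only). The function name `guess_country` and the three tiny tables are made up for illustration and are nowhere near complete; a real solution would need a full gazetteer, which is exactly what I am hoping a library or API can provide.

```python
from typing import Optional

# Tiny illustrative lookup tables (assumptions, far from complete).
COUNTRIES = {"australia": "AU", "indonesia": "ID", "mexico": "MX", "india": "IN"}
US_STATES = {"tx", "ny", "ca", "texas", "new york", "california"}
CITIES = {"london": "GB", "nairobi": "KE", "bucharest": "RO"}


def guess_country(location: Optional[str]) -> Optional[str]:
    """Return an ISO 3166-1 alpha-2 code, or None if nothing matches."""
    if location is None:
        return None
    text = location.strip().lower()
    # In this data set "NA" marks a missing profile location.
    if text in ("", "na"):
        return None
    # Check comma-separated parts from right to left, since the most
    # general part ("TX", "Indonesia") usually comes last.
    parts = [p.strip() for p in text.split(",") if p.strip()]
    for part in reversed(parts):
        if part in COUNTRIES:
            return COUNTRIES[part]
        if part in US_STATES:
            return "US"
        if part in CITIES:
            return CITIES[part]
    return None


if __name__ == "__main__":
    for raw in ["Austin, TX", "Bali, Indonesia", "London", "Europe", "NA"]:
        print(repr(raw), "->", guess_country(raw))
```

Entries like "Asia Pacific", "Europe", "Königstetten" and "Southern California" fall through to None with these tables, which is the part I don't know how to handle well by hand.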
