Abstract: As we continue to be flooded with increasing amounts of text data, we have a growing need for tools that will not only allow us to retrieve documents, but also mine the structured data buried inside their natural language text. Such structured representations then enable automated methods to model outliers, predict the future, and provide decision support. For example, U.S. ports would be safer if automated processes could detect suspicious patterns in the shipping manifests of cargo container ships; a system could suggest which DNA array experiments may be most fruitful by mining facts from biology research articles; we could better predict long-term weather trends by building a weather database from large collections of thousand-year-old Chinese diary entries. Information extraction is the process of filling a structured database from unstructured text. It is a difficult statistical and computational problem, often involving hundreds of thousands of variables, complex algorithms, and noisy, sparse data. In this talk I will briefly review previous work on finite-state, conditionally-trained Markov random field models for information extraction, and then describe three pieces of recent work: (1) the application of conditional Markov random fields to the extraction of tables from government reports, (2) feature induction for these models, applied to named entity extraction, and (3) a new random field method for noun co-reference resolution that has strong ties to graph partitioning.
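To make the finite-state sequence-labeling setting concrete, the sketch below runs Viterbi decoding over a toy linear-chain model for named entity extraction. The labels, scores, and example sentence are all illustrative assumptions, not the models or data from the talk; a trained conditional random field would learn these scores from features, but the decoding step is the same.

```python
# Minimal Viterbi decoding for a linear-chain sequence model, the kind of
# finite-state inference used in named entity extraction.
# All labels, weights, and the example sentence are toy assumptions.

LABELS = ["O", "PER"]  # outside any entity / person name

def emission(token, label):
    """Hypothetical per-token score for assigning `label` to `token`."""
    person_like = token[0].isupper()
    if label == "PER":
        return 2.0 if person_like else -1.0
    return 1.0 if not person_like else 0.0

# Hypothetical transition scores: score of moving prev_label -> label.
TRANSITION = {
    ("O", "O"): 0.5, ("O", "PER"): 0.0,
    ("PER", "O"): 0.0, ("PER", "PER"): 1.0,
}

def viterbi(tokens):
    """Return the highest-scoring label sequence under the toy model."""
    # best[i][y] = score of the best labeling of tokens[:i+1] ending in y
    best = [{y: emission(tokens[0], y) for y in LABELS}]
    back = [{}]  # back[i][y] = previous label on that best path
    for i in range(1, len(tokens)):
        scores, pointers = {}, {}
        for y in LABELS:
            prev = {p: best[-1][p] + TRANSITION[(p, y)] for p in LABELS}
            p_best = max(prev, key=prev.get)
            scores[y] = prev[p_best] + emission(tokens[i], y)
            pointers[y] = p_best
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    y = max(best[-1], key=best[-1].get)
    path = [y]
    for i in range(len(tokens) - 1, 0, -1):
        y = back[i][y]
        path.append(y)
    return list(reversed(path))

print(viterbi(["Smith", "visited", "the", "port"]))
# → ['PER', 'O', 'O', 'O']
```

A conditionally-trained model replaces the hand-set scores above with weighted sums of features of the input, but runs this same dynamic program at prediction time.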