Photo Essay - CJ 8803/4803
Joseph Jo, Sheila Isbell, Angus Perkerson, Steph Yang
The assignment was to create a photographic essay, or photographic representation, based on terms found within an article. It touches on why people like photo essays, how they can be organized, and the power of an image.
Originally, we designed the photo essay to be semi-automatic, with the emphasis on key terms derived from named entity recognition tools for a set of 15 pre-determined articles.
The resulting .xml files looked like the following:
<person>Charlie Rose</person> has worked for <organization>IBM</organization> for <date> 10 years</date>
In the recognition tool, we usually use color to code the different entity types, e.g.:
Charlie Rose has worked for IBM for 10 years
Here yellow marks a person entity, gray an organization, and green a date.
For each article, the tagged output was used as input to a script that extracted lists of person, organization, and location entities. These lists were used to determine which key terms (the content of the entity tags) we should search on.
The resulting key terms were then used in a query to automatically search Yahoo! Images for image files matching some or all of the search terms. The resulting images would be saved in a directory for each article and linked to on the webpage, making up the photo essay.
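The extraction step can be illustrated with a minimal sketch. It is not the original script: it assumes the tagged output uses the tag names shown in the sample (`person`, `organization`) plus a `location` tag, and the function name `extract_entities` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: entities are wrapped in flat, non-nested tags such as
# <person>...</person>. Only the three types used for searching are matched,
# so <date> entities are ignored.
ENTITY_PATTERN = re.compile(
    r"<(person|organization|location)>(.*?)</\1>", re.DOTALL
)


def extract_entities(tagged_text):
    """Map each entity type to its unique terms, in order of first appearance."""
    entities = defaultdict(list)
    for entity_type, content in ENTITY_PATTERN.findall(tagged_text):
        term = " ".join(content.split())  # collapse stray whitespace
        if term and term not in entities[entity_type]:
            entities[entity_type].append(term)
    return dict(entities)


sample = (
    "<person>Charlie Rose</person> has worked for "
    "<organization>IBM</organization> for <date> 10 years</date>"
)
result = extract_entities(sample)
print(result)
```

A regular expression is enough here only because the tags are assumed to be flat; a real XML parser would be the safer choice for well-formed files.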
We did not continue with this implementation for a couple of reasons:
Our final approach was to use additional web services offered by Yahoo!, including the news search, content analysis, and photo search APIs.
The user inputs a search term for today's news. Our script searches Yahoo! News for the top ten articles matching the search term. We then run Yahoo! Content Analysis on each resulting article. This Term Extraction Web Service provides a list of significant words or phrases extracted from a larger body of content. It takes in the content and any key terms that can help guide the extraction. The result set is all of the extracted terms in XML format.
The resulting keywords are used in Yahoo! Image Search queries. We grab the top 10 images from the image search and display them, each with a link to the article the image came from, along with the original article.
The result of our prototype is a set of dynamically generated photo essays based on significant text within the articles. Each image points back to an article that may give the user more textual context in addition to the original article. So the application does two things: it collects images that may be relevant to the article, and it provides additional sources to users through those images.
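The flow described above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the original script: the three services are passed in as plain functions (`news_search`, `extract_terms`, `image_search` are hypothetical names, and no real Yahoo! API calls are shown), and the dictionary keys are invented for the example. It also assumes the 10-image limit applies per article.

```python
def build_photo_essay(search_term, news_search, extract_terms, image_search,
                      max_articles=10, max_images=10):
    """Return one entry per article: the article plus the images found for it."""
    essay = []
    for article in news_search(search_term)[:max_articles]:
        images = []
        # One image query per extracted keyword, as in the prototype.
        for term in extract_terms(article["summary"]):
            for image in image_search(term):
                if len(images) < max_images:
                    images.append(image)
        essay.append({"article": article, "images": images})
    return essay


# Stand-in services so the sketch runs without any network access.
def fake_news_search(term):
    return [{"title": "Story about " + term,
             "url": "http://example.com/story",
             "summary": "Bill Clinton visited IBM"}]


def fake_extract_terms(text):
    return ["Bill Clinton", "IBM"]


def fake_image_search(term):
    return [{"image_url": "http://example.com/" + term.replace(" ", "_") + ".jpg",
             "source_url": "http://example.com/source"}]


essay = build_photo_essay("clinton", fake_news_search,
                          fake_extract_terms, fake_image_search)
for entry in essay:
    print(entry["article"]["title"], len(entry["images"]))
```

Passing the services in as functions keeps the control flow testable and makes it easy to swap in a different search provider.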
After coding and testing the project, we realized that there was much room for improvement. One major drawback of our search algorithm is that it uses multiple queries: we first query Yahoo! for summaries of the articles matching the search term, and then query Yahoo! again using the keywords to find images. This slowed the process down significantly, as each keyword (even duplicates) was sent in the search.
Although this implementation is automatic, it doesn't address a couple of issues: the order of the images, and making sure that the essay isn't overwhelmed with images of the same person, which would require nominal coreference resolution or the use of a thesaurus. For example, in the current design Bill Clinton is a different term than Clinton, so we could end up with two images of Bill Clinton. Though this may be appropriate if the article is full of references to Bill Clinton, it may also crowd out other images that are important as well.
In addition to the drawbacks of the algorithm, there were a few problems with the way the gathered information was shown, including duplicate and broken images returned from our search. The actual interface of the program could be improved as well.
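One inexpensive way to reduce the number of image queries would be to drop duplicate keywords before searching. The sketch below is a possible approach, not part of the prototype; the function name `unique_keywords` is hypothetical, and treating keywords as equal when they differ only in case or spacing is an assumption.

```python
def unique_keywords(keywords):
    """Drop repeated keywords, ignoring case and extra whitespace."""
    seen = set()
    unique = []
    for keyword in keywords:
        key = " ".join(keyword.lower().split())  # normalized form for comparison
        if key and key not in seen:
            seen.add(key)
            unique.append(keyword.strip())  # keep the first spelling encountered
    return unique


keywords = ["IBM", "Bill Clinton", "ibm", "  Bill  Clinton ", "Yahoo"]
print(unique_keywords(keywords))  # ['IBM', 'Bill Clinton', 'Yahoo']
```

With this filter, each distinct keyword triggers at most one image search per article.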
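A very rough stand-in for coreference handling is to fold a shorter name into a longer one that contains all of its words. The sketch below illustrates that heuristic only; it is not real coreference resolution, the function name `merge_name_variants` is hypothetical, and it deliberately leaves a short name alone when more than one longer name could match.

```python
def merge_name_variants(terms):
    """Map each term to a canonical form, folding short names into longer ones."""
    # Longest names first, so full names are registered before their variants.
    ordered = sorted(set(terms), key=lambda t: (-len(t.split()), t))
    canonical = []
    mapping = {}
    for term in ordered:
        words = set(term.lower().split())
        # Canonical names whose words strictly include all words of this term.
        candidates = [full for full in canonical
                      if words < set(full.lower().split())]
        if len(candidates) == 1:
            mapping[term] = candidates[0]
        else:
            # No match, or an ambiguous one: keep the term as its own entry.
            canonical.append(term)
            mapping[term] = term
    return mapping


mapping = merge_name_variants(["Bill Clinton", "Clinton", "IBM", "Clinton"])
print(mapping["Clinton"])  # Bill Clinton
```

Searching only on the distinct canonical values would then yield one query for Bill Clinton instead of two, leaving room for images of other entities.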
Improvements which could be made to this project include:
A photo essay is a pictorial representation of textual content.
From Wikipedia: A photo essay is a set or series of photographs that are intended to tell a story or evoke a series of emotions in the viewer. Photo essays range from purely photographic works to photographs with captions or small notes to full text essays with a few or many accompanying photographs. Photo essays can be sequential in nature, intended to be viewed in a particular order, or they may consist of non-ordered photographs which may be viewed all at once or in an order chosen by the viewer. All photo essays are collections of photographs, but not all collections of photographs are photo essays. Photo essays often address a certain issue or attempt to capture the character of places and events.