COMPUTER RESEARCH & TECHNOLOGY

I love sliders for searches, and now Yahoo's given us all a new one to play with. Their Mindset search lets you use a "slider bar" to specify whether you're doing research or shopping, and you can try out their test model right now at http://mindset.research.yahoo.com/. Move the slider left for more shopping-type results, and right for more research-type results. The best use for this search is for things that have strong representation in both shopping and research pages: pretty much any brand name, most nouns, and so on. It's great fun to play with.

What does this Mindset thingy do again? Well, it sorts the results for your query into commercial or non-commercial (informational) results, depending on whether you're shopping or seeking information. In this context, "commercial" means that the primary purpose of a particular web page is to sell you something. "Informational" means that the primary purpose of the page is to provide information related to your search.

But aren't many web pages a combination of both commercial and informational? Yes, many are. Yahoo say that's why they assign each page a relatively continuous score ranging from -2 (most commercial) to +2 (most informational). Pages that score zero are considered a balance of commercial and informational. The scores themselves are assigned to web results by machine-learning technology. Remember, however, that this software is still a work in progress, so some results might vary radically. Even so, the scoring, while not perfect, is usually good enough to get started.

So how does it "actually" work, and what does that slider thing at the top of the page do? First, you enter your search into a search box, just as you would with any search engine, and press the search button. Then you use the slider to decide how you want the results sorted. The midpoint of the slider represents the default setting. In this position, the order of results matches Yahoo! Search web results. As you move the slider right, toward "researching", or left, toward "shopping", the results are automatically re-sorted for you.

There are two different sorting mechanisms at work here: the default sorting done by Yahoo! Search, and a secondary sorting based on the assigned commercial and non-commercial scores. With the slider in the middle position, only the default Yahoo! Search sort is used. When the slider is at either end, only the secondary commercial/non-commercial sort is used. When the slider is anywhere in between, Yahoo! Mindset presents a blend of the two sorting systems. The further the slider is moved toward either extreme, the more weight is given to the second sorting method.

What are those little blue and orange bars under each result? These coloured bars represent the scores. A longer bar means a higher absolute score for the result, so the result is more definitively commercial (blue, on the left) or informational (orange, on the right). Results with neither a blue nor an orange bar have a score of 0. This means that Mindset has determined the page is equal parts commercial and non-commercial, or completely ambiguous. The grey number in brackets alongside each rank is the default rank for that result. Notice that with the slider set in the middle position, the displayed rank numbers and the grey default rank numbers are the same. When you move the slider right or left, the results are rearranged and the two numbers can differ.

What is the nature of the technology behind this program? The Mindset system is an example of machine learning applied to the problem of text classification. Machine learning and text classification are two different fields of technical research that found common cause about ten years ago with the emergence of the Web. Text classification refers to the problem of classifying documents automatically into different subject categories.

Why is text classification useful for the Web? The unstructured abundance of web documents presents a new and difficult text classification challenge. Accurate, automated classification can help users find the information they seek on the Web.

Tell me again, why is machine learning useful for the Web? Machine learning is especially useful for applying human-like behaviour to sets of data ("information" to most people other than the technical ones) so large that it would be infeasible for humans to do the work. When the Web took off about ten years ago, machine learning acquired a cherished prize: a huge and ever-growing body of information. With billions of pages and counting, the Web is too big for humans to cover entirely. This is where machine learning comes in.

Arthur Hissey
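
For readers who like to see the idea in code, here is a rough sketch of how a slider could blend the two sort orders described above. Yahoo have not published their actual formula, so everything here is an assumption made for illustration: the Result class, the rerank function, the slider range of -1.0 (fully "shopping") to +1.0 (fully "researching"), and the simple linear blend. Only the -2 to +2 score range and the behaviour at the middle and the ends of the slider come from Yahoo's description.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One search result (hypothetical structure, for illustration only)."""
    title: str
    default_rank: int  # 1 = top of the ordinary search ordering
    score: float       # -2.0 (most commercial) to +2.0 (most informational)


def rerank(results, slider):
    """Re-sort results for a slider position.

    slider is assumed to run from -1.0 (fully shopping) through 0.0
    (default order) to +1.0 (fully researching). The linear blend below
    is a guess at the general idea, not Yahoo's real method.
    """
    if not -1.0 <= slider <= 1.0:
        raise ValueError("slider must be between -1.0 and 1.0")

    n = len(results)
    weight = abs(slider)                      # 0 = default sort only, 1 = score sort only
    direction = 1.0 if slider >= 0 else -1.0  # +1 favours informational, -1 commercial

    def blended(r):
        # Default rank rescaled to 0..1, where 1.0 is the top default result.
        default_part = 1.0 - (r.default_rank - 1) / max(n - 1, 1)
        # Score rescaled to 0..1, where 1.0 best matches the slider direction.
        score_part = (direction * r.score + 2.0) / 4.0
        return (1.0 - weight) * default_part + weight * score_part

    # Highest blended value first; ties fall back to the default rank.
    return sorted(results, key=lambda r: (-blended(r), r.default_rank))


if __name__ == "__main__":
    sample = [
        Result("Buy cameras online", 1, -1.8),
        Result("Camera buying guide", 2, 0.5),
        Result("History of photography", 3, 1.9),
    ]
    for position in (-1.0, 0.0, 0.5, 1.0):
        print(position, [r.title for r in rerank(sample, position)])
```

With the slider at 0.0 the sketch returns the default order, and at either end it orders purely by score, which matches the behaviour described in the article. What happens in between depends entirely on the assumed blend.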
ETOPICS

Keep up to date with the latest in the IT/Communications industry by listening to ABC Local Radio on FM107.1 every Tuesday morning at 9.15 AM. Computer Research & Technology Managing Director Arthur Hissey and Morning Host Janice McGilchrist discuss current matters of interest and future directions in the IT industry. Transcripts of these discussions and other topics are available from the ETopic archives.