YQL – Yahoo! Query Language


The Yahoo! Query Language is an expressive SQL-like language that lets you query, filter, and join data across Web services.

YQL’s possibilities are virtually endless. Say you want to get specific Flickr images containing a defined word in the title, or you want to geocode some addresses on the fly. YQL makes those tasks extremely easy: you form a simple query that gathers the data. Output can be switched between JSON and XML, so you can choose whatever fits your application best.

The best way to go about using a YQL service is as follows:

  • Construct your query in the YQL Console and check that it returns the result you expect.
  • Copy the REST Query URL the console gives you at the bottom and insert it into your web app. There are even examples in the documentation on how to use REST queries in different programming environments.

Let me give you an example of such a query:

select * from upcoming.events where woeid in (select woeid from geo.places where text="Vienna, Austria")

Calling the following REST URL returns an XML response listing all upcoming events in Vienna, Austria, via the Yahoo! Upcoming API:

http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20upcoming.events%20where%20woeid%20in%20(select%20woeid%20from%20geo.places%20where%20text%3D%22Vienna%2C%20Austria%22)%0A
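If you would rather build that URL in code than copy it from the console, something like the following Python sketch should do. It only constructs the URL and does not send a request. The endpoint is taken from the URL above; the helper name `build_yql_url` and the optional `format` parameter for the XML/JSON switch are my own assumptions, so double-check them against the documentation. Note that `quote` also percent-encodes `*` and the parentheses, so the result is not byte-identical to the console URL, although it decodes to the same query.

```python
from urllib.parse import quote, urlencode

# Public endpoint, taken from the REST URL shown above.
YQL_PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, fmt=None):
    """Return a REST URL for the given YQL statement (hypothetical helper).

    `fmt` is an assumption: the XML/JSON output switch is modelled here
    as an optional `format` query parameter.
    """
    params = {"q": query}
    if fmt is not None:
        params["format"] = fmt
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return YQL_PUBLIC_ENDPOINT + "?" + urlencode(params, quote_via=quote)


query = (
    "select * from upcoming.events where woeid in "
    '(select woeid from geo.places where text="Vienna, Austria")'
)
url = build_yql_url(query, fmt="json")
print(url)
```

Keeping the query as a plain string and leaving the escaping to `urlencode` avoids hand-editing the percent-encoded form whenever the query changes.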

This service makes it very easy to use different web APIs without knowing their respective syntax, and it becomes extremely convenient when you start joining different web services to get a combined result.
Here is one more example. It gets all the Foursquare places around a particular address, sorted by distance, by combining geo.placefinder and Foursquare:

USE "http://www.datatables.org/foursquare/foursquare.venues.xml" as venues;
SELECT group.venue.name, group.venue.address, group.venue.stats.herenow, group.venue.distance FROM venues WHERE (geolat, geolong) IN (SELECT latitude, longitude FROM geo.placefinder WHERE text="koenigsklostergasse 7, wien") | sort(field="group.venue.distance", descending="false");

You can try this query in the YQL Console.

The really neat thing is that YQL can be used commercially, and the rate limit applied per IP address is decent enough that it should be usable in production.

Usage Information:

  • YQL can be used for commercial purposes.
  • If we’re going to shut down YQL, we will give you at least 6 months notice with an announcement on YDN and in our forum.
  • YQL has a performance uptime target of over 99.5%.
  • YQL relies on the correct operation of the Web services and content providers it accesses.

Rate Limits:

  • Per application limit (identified by your Access Key): 100,000 calls per day.
  • Per IP limits: /v1/public/*: 1,000 calls per hour; /v1/yql/*: 10,000 calls per hour.
  • All rates are subject to change.
  • YQL rate limits are subject to the rate limits of other Yahoo! and 3rd-party Web services.
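
To stay under the per-IP limit on the public endpoint, you could count your own calls on the client side before sending them. Below is a minimal sketch of a sliding-window counter in Python. The class `HourlyThrottle` is a hypothetical helper of mine, not part of YQL, and the default of 1,000 calls per hour simply mirrors the /v1/public/* figure listed above.

```python
import time
from collections import deque


class HourlyThrottle:
    """Sliding-window call counter (hypothetical helper, not part of YQL)."""

    def __init__(self, max_calls=1000, window_seconds=3600, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the class can be tested without waiting
        self.calls = deque()  # timestamps of the calls inside the current window

    def try_acquire(self):
        """Record a call and return True, or return False if the budget is spent."""
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True


throttle = HourlyThrottle(max_calls=1000)
if throttle.try_acquire():
    print("OK to send the next YQL request")
else:
    print("Hourly budget used up, back off for a while")
```

This only tracks calls made by one process. Since the limit is per IP address, several apps behind the same address would have to share the counter.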


Search Engines and AJAX

A few thoughts on how the whole Web 2.0 hype thing might interfere with search engines.

I recently looked a bit into programming AJAX functionality in PHP for a closed project for the Red Cross. I mainly used it to implement an “edit-in-place” feature, which you might know from sites like Flickr. In another project, which I will announce here shortly, I used JavaScript and AJAX in JSP/J2EE.

After looking into all the new possibilities that come with AJAX, I started thinking about how search engines index pages and how the semantic web might be influenced by these new technologies. If people use AJAX more and more (which I hope they do) to create less web-like user interfaces that update information dynamically, search engines won’t be able to see all the information available on a specific website.

The possible solution I came up with is something like a mashup between robots.txt and web services. If a web application could offer a web service for search robots that spits out an XML rendering of the information available on the page (behind the scenes in the database), the search engine could easily index it and map its context, available in the XML structure, to the content. Another advantage would be that sites could determine exactly which information should be found by search engines and which should remain only on their site.
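
To make the idea a bit more concrete, here is a rough Python sketch of what the rendering side of such a web service might look like. Everything in it is made up for illustration: the function `render_crawler_feed`, the element names, the record fields, and the `indexable` flag that decides what search robots get to see. No search engine actually consumes this format.

```python
import xml.etree.ElementTree as ET


def render_crawler_feed(site_url, records):
    """Serialize database records into an XML document for search robots.

    Element and attribute names are hypothetical. Records flagged
    indexable=False are left out, so the site controls what gets indexed.
    """
    root = ET.Element("site", {"url": site_url})
    for record in records:
        if not record.get("indexable", True):
            continue
        # The "type" attribute carries the context the crawler can map to the content.
        item = ET.SubElement(root, "item", {"type": record["type"]})
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "body").text = record["body"]
    return ET.tostring(root, encoding="unicode")


records = [
    {"type": "article", "title": "Edit in place", "body": "Public text.", "indexable": True},
    {"type": "note", "title": "Internal", "body": "Private text.", "indexable": False},
]
feed = render_crawler_feed("http://example.com/", records)
print(feed)
```

The web application would serve this output at an address that search robots know to look for, much like they look for robots.txt today.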

One offshoot of this concept would be that directories like UDDI could be built up and searched by the search robots, making it very easy to promote websites in a very descriptive manner. (Remind me to start such a directory website when the concept takes off, so I can charge customers for being listed and make loads of money :) )