A few thoughts on how the whole Web 2.0 hype thing might interfere with search engines.
I recently looked a bit into programming AJAX functionality in PHP for a closed project for the Red Cross. I mainly used it to implement an “edit-in-place” feature, which you might know from sites like Flickr. In another project, which I will announce here shortly, I used JavaScript and AJAX with JSP/J2EE.
After looking into all the new possibilities that AJAX opens up, I started thinking about how search engines index pages and how the semantic web might be influenced by these new technologies. If people use AJAX more and more (which I hope they do) to create less web-like user interfaces that update information dynamically, search engines won’t be able to see all the information available on a given website, because much of it only appears after a script has fetched it and never shows up in the page a crawler downloads.
The possible solution I came up with is something like a mashup between robots.txt and web services. If a web application could offer a web service for search robots that spits out an XML rendering of the information available on the page (behind the scenes, in the database), the search engine could easily index it and map the context, available in the XML structure, to the content. Another advantage would be that sites could determine exactly which information should be found by search engines and which should remain only on their site.
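To make this a bit more concrete, here is a minimal sketch in Python of what the rendering side of such a web service might look like. Everything in it is an assumption for illustration: the records, the `site-index` and `item` element names, the `PUBLIC_FIELDS` whitelist and the `render_index_xml` function are made up, and no search engine actually consumes this format.

```python
import xml.etree.ElementTree as ET

# Hypothetical records, standing in for rows from the site's database.
RECORDS = [
    {"id": 1, "title": "Sunset over the harbour",
     "tags": "sunset, harbour", "owner_email": "a@example.org"},
    {"id": 2, "title": "Market day",
     "tags": "market, street", "owner_email": "b@example.org"},
]

# Hypothetical whitelist: only these fields are exposed to search robots.
# Anything not listed here (such as owner_email) stays on the site.
PUBLIC_FIELDS = ("title", "tags")


def render_index_xml(records, public_fields=PUBLIC_FIELDS):
    """Render the whitelisted fields of each record as an XML string."""
    root = ET.Element("site-index")
    for record in records:
        item = ET.SubElement(root, "item", id=str(record["id"]))
        for field in public_fields:
            if field in record:
                # The element name carries the context, the text the content.
                ET.SubElement(item, field).text = str(record[field])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_index_xml(RECORDS))
```

A real service would serve this output at a URL that the robot discovers, for instance through an entry next to robots.txt, but that part is left out here. The whitelist is what gives the site control over which information gets indexed.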
One offshoot of this concept would be that directory services like UDDI could be built up for search robots to query, making it very easy to promote websites in a very descriptive manner. (Remind me to start such a directory website when the concept takes off, so I can charge customers for being listed and make loads of money 🙂 )