Google Co-op just got del.icio.us!

Update: Sorry, link is going up and down. Worth trying, but will try to find a more stable option when time cycles free up.

This past week I decided to cook up a service (link in bold near the middle of this post) that I feel will greatly assist users in developing advanced Google Custom Search Engines (CSE’s). I read through the Co-op discussion posts, digg/blog comments, reviews, emails, etc. and learned that many of our users are fascinated by the refinements feature - in particular, building search engines that produce results like this:

“linear regression” on my Machine Learning Search Engine

… but unfortunately, many do not know how to do this, and either don’t understand the XML or don’t want to hack it up. Additionally, I think it’s fair to say many users interested in building advanced CSE’s have already done similar site tagging/bookmarking through services like del.icio.us. del.icio.us really is great. Here are a few reasons why people should (and do) use del.icio.us:

  • It’s simple and clean
  • You can multi-tag a site quickly (comma separated field; don’t have to keep reopening the bookmarklet like with Google’s)
  • You can create new tags on the fly (don’t choose the labels from a fixed drop-down like with Google’s)
  • The bookmarklet provides auto-complete tag suggestions; shows you the popular tags others have used for that current site
  • Can have bundles (two level tag hierarchies)
  • Can see who else has bookmarked the site (can also view their comments); builds a user community
  • Generates a public page serving all your bookmarks

Understandably, we received several requests to support del.icio.us bookmark importing. My part-time role with Google just ended last Friday, so, as a non-Googler, I decided to build this project. Initially, I was planning to write a simple service to convert del.icio.us bookmarks into CSE annotations - and that’s it - but realized, as I learned more about del.icio.us, that there were several additional features I could develop that would make our users’ lives even easier. Instead of just generating the annotations, I decided to generate the CSE contexts as well.

Ok, enough talk, here’s the final product:
http://basundi.com:8000/login.html

If you don’t have a del.icio.us account, and just want to see how it works, then shoot me an email (check the bottom of the Bio page) and I’ll send you a dummy account to play with (can’t publicize it or else people might spam it or change the password).

Here’s a quick feature list:

  • Can build a full search engine (like the machine learning one above) in two steps, without having to edit any XML, and in less than two minutes
  • Auto-generates the CSE annotations XML from your del.icio.us bookmarks and tags
  • Provides an option to auto-generate CSE annotations just for del.icio.us bookmarks that have a particular tag
  • Provides an option to Auto-calculate each annotation’s boost score (log normalizes over the max # of Others per bookmark)
  • Provides an option to Auto-expand links (appends a wildcard * to any links that point to a directory)
  • Auto-generates the CSE context XML
  • Auto-generates facet titles
  • Since there’s a four facet by five labels restriction (that’s the max that one can fit in the refinements display on the search results page), I provide two options for automatic facet/refinement generation:
    • The first uses a machine learning algorithm to find the four most frequent disjoint 5-item-sets (based on the # of del.icio.us tag co-occurrences; it then does query-expansion over the tag sets to determine good facet titles)
    • The other option returns the user’s most popular del.icio.us bundles and corresponding tags
    • Any refinements that do not make it in the top 4 facets are dumped in a fifth facet in order of popularity. If you don’t understand this then don’t worry, you don’t need to! The point is all of this is automated for you (just use the default Cluster option). If you want control over which refinements/facets get displayed, then just choose Bundle.
  • Provides help documentation links at key steps
  • And best of all … You don’t need to understand the advanced options of Google CSE/Co-op to build an advanced CSE! This seriously does all the hard, tedious work for you!
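
As a rough illustration of the annotation-related options in the list above (this is not the code behind the service), the sketch below builds an annotations document from a few bookmarks. The `Annotations`/`Annotation`/`Label` element names, the `about` and `score` attributes, the exact boost formula (log(1 + others) / log(1 + max others)), and the rule that a "directory" link is one ending in `/` are all assumptions made for illustration; check them against the CSE documentation before uploading anything.

```python
import math
import xml.etree.ElementTree as ET


def expand_link(url):
    """Append a wildcard to links that look like directories (assumed: URL ends in '/')."""
    return url + "*" if url.endswith("/") else url


def boost_score(others, max_others):
    """Log-normalize a bookmark's 'Others' count into [0, 1] (assumed formula)."""
    if max_others <= 0:
        return 0.0
    return math.log(1 + others) / math.log(1 + max_others)


def build_annotations(bookmarks, cse_label):
    """Build annotations XML from bookmarks.

    bookmarks: list of dicts with 'url' (str), 'tags' (list of str), 'others' (int).
    cse_label: the label that ties an annotation to one search engine (hypothetical value).
    """
    max_others = max((b["others"] for b in bookmarks), default=0)
    root = ET.Element("Annotations")
    for b in bookmarks:
        score = boost_score(b["others"], max_others)
        ann = ET.SubElement(
            root, "Annotation", about=expand_link(b["url"]), score=f"{score:.2f}"
        )
        # One label for the engine itself, then one per del.icio.us tag.
        ET.SubElement(ann, "Label", name=cse_label)
        for tag in b["tags"]:
            ET.SubElement(ann, "Label", name=tag)
    return ET.tostring(root, encoding="unicode")
```

Calling `build_annotations(bookmarks, "_cse_example")` returns a single XML string; filtering `bookmarks` by one tag beforehand would mimic the per-tag option.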

In my opinion, there’s no question that this is the easiest way to make a fancy search engine. If I make any future examples I’m using this - I can simply use del.icio.us, sign in to this service, and voilà, I have a search engine with facets and multi-label support.
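
For readers curious about what grouping tags into facets by co-occurrence might look like, here is a deliberately simple greedy stand-in. It is not the item-set algorithm the Cluster option uses, and it skips the title-generation step entirely; the function name, the tie-breaking rule (alphabetical), and the input shape are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def greedy_facets(bookmark_tags, n_facets=4, facet_size=5):
    """Group tags into disjoint facets by how often they co-occur on a bookmark.

    bookmark_tags: list of tag lists, one per bookmark.
    Returns (facets, leftovers); leftovers are ordered by popularity.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in bookmark_tags:
        unique = sorted(set(tags))
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))  # keys are (a, b) with a < b

    def cooc(a, b):
        return pair_count[(a, b) if a < b else (b, a)]

    remaining = set(tag_count)
    facets = []
    while remaining and len(facets) < n_facets:
        # Seed each facet with the most frequent unused tag.
        seed = max(sorted(remaining), key=lambda t: tag_count[t])
        facet = [seed]
        remaining.discard(seed)
        while remaining and len(facet) < facet_size:
            best = max(sorted(remaining), key=lambda t: sum(cooc(t, m) for m in facet))
            if sum(cooc(best, m) for m in facet) == 0:
                break  # nothing left that co-occurs with this facet
            facet.append(best)
            remaining.discard(best)
        facets.append(facet)

    leftovers = sorted(remaining, key=lambda t: (-tag_count[t], t))
    return facets, leftovers
```

The `leftovers` list plays the role of the fifth catch-all facet: whatever did not fit in the first four, in order of popularity.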


Please note that this tool is neither officially endorsed by nor affiliated with Google or Yahoo! It was just something I wanted to work on for fun that I think will benefit many users (including myself). Also, send your feedback/issues/bugs to me or post them on this blog.

69 Responses to “Google Co-op just got del.icio.us!”

  1. » AutoGenerate A Google Custom Search Engine With del.icio.us » InsideGoogle » part of the Blog News Channel Says:

    [...] Singh, who used to work at Google, has created a really amazing tool that lets you create a highly Google Custom Search Engine without knowing how [...]

  2. nanek Says:

    Is this like http://sandbox.sourcelabs.com/kibbutz/generate.php ?

  3. zooie Says:

    Yea I saw that when researching this. The main difference is the tool that I provide actually works :) This one looks like a UI prototype or something - try a bad delicious login and it works, it doesn’t even ask for a CSE ID (how does it know where to upload to?), nor does it show you the annotations output. My tool gives you the output for the annotations, plus uses some AI tricks to also autogenerate the context.

  4. Riyaz Mohammed Says:

    Thanks, nice to see this.

    I visited eagerly but i stuck in this err

    http://idlivada.vpsland.com:8000/coop_delicious.py/design?user=webarmi&passw=?????????&cse=_cse_1_zcj0f6hhk&title=My%20Delicious&keywords=java&descr=My%20Delicious&volunteers=false&groupby=c&subrlinks=true

    here password is changed

  5. Babs Says:

    Hi, during the login process, Do you capture/record my login ID and password of del.icio.us?

  6. zooie Says:

    Thanks for the bug report, Riyaz. Try the link now; it should work.
    * I wasn’t UTF-8 encoding the tag strings in my title generation step. Haven’t seen tag names like that before but hey you have every right to :) Thanks again.

  7. zooie Says:

    Hi Babs - Technically it does get captured (GET request logs) since those values are passed as CGI parameters to the URL. I will not use/sell them. I plan to delete the logs periodically.

    It’s tough to provide full encryption since I don’t have an SSL certificate. In the meantime, if you’re concerned about security/privacy, why not go to del.icio.us, change your password to some temporary value, run this wizard, then change your password back? The wizard takes like a minute to do. Later today I will (finally) encrypt passwords in all URL requests to provide a decent level of security.

    I’m also wondering if I should release this as a desktop application. Would people prefer this?

  8. Riyaz Mohammed Says:

    Please do me one favour, please my second comment and this one too.

    I think u knew why i’m saying this ;-)

  9. Riyaz Mohammed Says:

    Please do me one favour, please delete my second comment and this one too.

    I think u knew why ;-)

    Note:
    I’ve to take at least one minute to check it before posting a comment cos no editing here . One new resolution for this year to follow. :-)

  10. zooie Says:

    Hi Riyaz - I deleted your comment with the link that exposed your password. I’m keeping (and replying to) your later comments so users know that they can directly email me such links in the future (check bottom of bio for contact information).

  11. Riyaz Mohammed Says:

    Thanks Singh, here’s one more bug (I hope so ;-)).

    Well, as u knew my password got exposed in last post, so i changed my password to a strong one (which includes $ * and alpha numeric)

    Here is the bug, my new password shows the err

    GetXmlResponse Error: HTTP 401 Code: Bad user/password webarmi ?????????

    Don’t worry i changed the password shown above with my real one while checking.

    One more clue for u, my old password working fine now, i hope the prob is only with my new pass contains special characters

    [Note: I double checked this comment b4 posting ;-) ]

  12. IP Says:

    Hi!

    Eager to test something new to boost my deli.cio.us account, but…

    When I try the tool I only get an error message “Failed to connect to Yahoo! Search” when I try to generate content XMl on step 2.

    Generating the annotation xml only generates an empty file…

    Perhaps I didn’t figure everything out ;)

  13. zooie Says:

    Riyaz - Done. I wasn’t escaping the password before.

    IP - In the process of doing this fix I may have messed up some connection settings. If it’s still not working for you then shoot me an email (you can find my contact info in the bio).

    Really appreciate the feedback.
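
    For anyone hitting the same kind of problem in their own code: characters such as `$`, `*` and `&` have to be percent-encoded before they go into a query string, or the server sees a different value than the one typed. A minimal sketch using Python's standard library follows; the parameter names `user` and `passw` are taken from the URL quoted earlier in this thread, and the function name is hypothetical. This only shows the escaping step, not how the service itself does it.

```python
from urllib.parse import urlencode, parse_qs


def build_query(user, password):
    """Percent-encode both values so special characters survive the round trip."""
    return urlencode({"user": user, "passw": password})


query = build_query("alice", "p$ss*w&rd")
# Decoding on the server side recovers the original password unchanged.
decoded = parse_qs(query)["passw"][0]
```

    `urlencode` also encodes non-ASCII text as UTF-8 by default, which covers the tag-name issue mentioned above.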

  14. zooie Says:

    Just updated the service so all post login requests encrypt the password parameter. The server rekeys every 30 minutes which should provide ample time for a user to generate his/her XML. If the login does not work, it most likely happened due to the key expiration, so just try re-logging in. If all else fails, just post the issue here or email me. Thanks.

  15. zooie Says:

    Just fixed a tag parsing issue. If you were getting extremely long label names that was due to a bug. Should be fixed now.

  16. del.icio.us arsblog - Aquatic Inference Engine Says:

    [...] search mashup that google-dexes your delicious bookmarks has been unleashed on the blogosphere. You can scroll down to see it in action with my own [...]

  17. shelbycockrell Says:

    This blog was very interesting to read and I like your writing style. Nice blog!

  18. links for 2007-01-08 at Metaverse Territories Says:

    [...] Google Co-op and del.icio.us!: build a full search engine without having to edit any XML, auto-generate the Custom Search Engines (CSE) annotations XML from del.icio.us bookmarks and tags « zooie’s blog (tags: engine google mashup del.icio.us search searchengine tagging xml) [...]

  19. Sanjay Goel Says:

    Thanks for the awesome product Zooie. I have one question. Everytime I add a new delicious link, do I need to go thru this process again of creating the annonate.xml and upload.
    If that is so, is there a way to simplify that ?

  20. tom Says:

    My 3500+ account’s annotation file generated a “413- Your client issued a request that was too large.” error while loading it into coop. Can I break the file into separate bits and load them one by one.

    Tom

  21. Generate A Google Custom Search Engine to Search Your del.icio.us Bookmarks » D’ Technology Weblog — Technology, Blogging, Gadgets, Fashion, Life Style. Says:

    [...] Vik’s Blog [...]

  22. Stephen Paul Weber Says:

    I tried this, and it only seems to have moved across a few urls, as shown by this output on the sites page:

    http://www.longfocus.com/firefox/gmanager/* Firefox Extensions Google

    Include all pages whose address contains this URL
    Include just the specific page or URL pattern I have entered

    http://www.awriterz.org/Fantasy/* Awriterz Fantasy

    Include all pages whose address contains this URL
    Include just the specific page or URL pattern I have entered

    beautifulbeta.blogspot.com/2006/10/pullquotes-for-your-blog.html Article Blog Publishing

    Include all pages whose address contains this URL
    Include just the specific page or URL pattern I have entered

    http://www.eusing.com/CDRipper/CDRipper.htm Computers Entertainment Music Software

    Include all pages whose address contains this URL
    Include just the specific page or URL pattern I have entered

    wiki.rubyonrails.com/rails/pages/HowtoSetupApacheWithFastCGIAndRubyBindings Article Linux Ruby Research

    All of the tags seem to be imported, but not most of the actual bookmarks…

  23. zooie Says:

    Sanjay - For now yes. The delicious API does support an update (which will push only new links since the last call) so it’s definitely feasible. When time cycles free up I’ll add that in. Thanks for the feature suggestion.

    Tom - Yeah there’s a limit on the XML file size being pushed back through the browser. Two solutions: (1) I can save the file on the server (but I’m reluctant to use server storage at the moment) (2) My wizard allows the user to generate annotations per delicious tag. Try that - so produce annotation files for your favorite delicious tags and just upload each one sequentially in the CSE.

    Stephen - Did you check the Rank option? Or filter your bookmarks by a tag? The rank option most likely won’t do every bookmark due to the expensiveness of retrieving the Other counts (the delicious API really needs to expose these numbers in the posts/all call). If you didn’t do either, then shoot me an email (I have my contact info in my bio page).
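
    If you already have one large annotations file and would rather split it than regenerate per tag, something along these lines could work. It is only a sketch: it assumes the file has a single root element with one child per annotation, the function name and chunk size are made up, and you would still need to confirm that uploading the pieces one after another adds to (rather than replaces) what is already in your CSE.

```python
import xml.etree.ElementTree as ET


def split_annotations(xml_text, chunk_size):
    """Split one annotations document into several smaller, well-formed documents."""
    root = ET.fromstring(xml_text)
    annotations = list(root)
    chunks = []
    for start in range(0, len(annotations), chunk_size):
        # Recreate the root element for each chunk and copy a slice of children into it.
        part = ET.Element(root.tag, dict(root.attrib))
        part.extend(annotations[start:start + chunk_size])
        chunks.append(ET.tostring(part, encoding="unicode"))
    return chunks
```

    Each string in the returned list can be saved to its own file and uploaded separately.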

  24. links for 2007-01-09 « timtowle Says:

    [...] Google Co-op just got del.icio.us! « zooie’s blog Incorporate your delicious entries in your Google searches (tags: del.icio.us mashup searchengine tagging) [...]

  25. Rod Guzzo Says:

    Hello,
    Can you send me an account to play with?

    Thank you,
    Rod Guzzo

  26. Tiedon hallintaa sosiaalisten kirjanmerkkien avulla? at Hypermediaa ja elämää. Says:

    [...] Singh, V. 2007. Google Co-op just got del.icio.us!. Saatavilla www-muodossa: http://zooie.wordpress.com/2007/01/03/google-co-op-just-got-delicious/. [...]

  27. Harish TM Says:

    Hey, Great tool, It would be nice if you could go into the xml creation a little thought … that will help further development of similar search engines…

  28. Nita Singh Says:

    Great blog Vik! The material is interesting and smart and I enjoy following it. Continued success!

  29. Library clips :: Google CSE and dynamic OPML :: February :: 2007 Says:

    [...] Also see how to use it to search your del.icio.us account. [...]

  30. Ann Hulton Says:

    Hello,

    Very interesting work. I wonder if you would send me an account so that I could play with the CSE. Thanks!

  31. jcs_goog Says:

    Great tool. Unfortunately I have lots of tagged URL’s and Google limits this to 2000. Any suggestions?

  32. Matthew Says:

    Don’t seem to be able to get your link http://basundi.com:8000/login.html to resolve… what’s going wrong?

  33. zooie Says:

    Hi Matthew - It works for me. Hmm. Try again (refresh and clear the cache if necessary). It should be working.

  34. GooglePowerSearch - Steve Says:

    Check this out http://www.googlepowersearch.com.
    I created GooglePowerSearch so you can power search for Video, News, Maps, Images and more…
    Google Power Search helps to unleash the built in power of Googles special features.
    Using Google Power Search you are able to get better-targeted results.
    Check out Google Power Search and let me know what you think.
    Thanks
    Steve

  35. caio Says:

    I am liking this idea

  36. maarcis Says:

    yours is the second blog (or actually third maybe) I have ever bookmarked (yea I don’t use aggregators)
    I already was like deleting my co-op account then I read through this post once and then TA-DA
    http://taxa.search.googlepages.com/home
    I even licensed it with same exact license as you had just to be sure.
    but I think I have comitted at least a dozen of copyright infringements as well

    I always get all giggly seeing the google labs logo but this just close to too exciting
    so yea thanks for pointing out how it’s done and I haven’t even started with implementing that facets x labels thing which sounds great (probably first I have to make a del.icio.us account)
    so yea this blog has been valuable content for me.

  37. Top 5 Daily Questions About Google Says:

    [...] VideoGoogle GroupsGoogle MapsGoogle NewsHow to Get Detailed PPC Keyword Data from Google AnalyticsGoogle Co-op just got del.icio.us! « zooie’s blogGoogle Code - Updates: Four Google open source tools on Google CodeNo comments yet.RSS for comments [...]

  38. Farhan Memon Says:

    Vik –

    Great tool. How would I take one of my subscriptions and turn it into a CSE.

    Let’s say that I’ve subscribed to the tag San Francisco, can I use your tool to take that subscribtion and generate web URLs that I can feed back into CSE?

  39. zooie Says:

    Hi Farhan - I would recommend looking into the OPML upload feature available in the Advanced tab of the CSE’s control panel. This will take OPML (and various RSS feed formats), extract its URL’s, and import them directly into the CSE. My tool currently just supports a user’s bookmarks available via del.icio.us’s API.

    The other option (in case the OPML feature does not work) would be to regex out the URL’s and pump them into a flat file (each link new-line separated), then paste the links in the sites box (Sites tab).
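
    A quick sketch of that regex route, in case it helps. The pattern is intentionally loose (anything starting with http:// or https:// up to whitespace, a quote, or an angle bracket), the function names are hypothetical, and URLs inside XML may still contain escaped characters such as `&amp;` that you would want to unescape first.

```python
import re

# Loose pattern: scheme, then everything up to whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>]+""")


def extract_urls(feed_text):
    """Return the URLs found in the text, de-duplicated, in order of first appearance."""
    seen = set()
    urls = []
    for url in URL_PATTERN.findall(feed_text):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def to_flat_file(urls):
    """Join URLs one per line, ready to paste into the sites box."""
    return "\n".join(urls) + "\n"
```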

  40. Bikinblogger Says:

    Hi VIk’s,

    Ok, I give up, just gimme your dummy in my email :(
    Thanks pal…

  41. Danny Zacharias Says:

    could this work with ma.gnolia bookmarks?

  42. zooie Says:

    Yeah it’s possible if there’s an API or XML feeds available for retrieving the bookmarks. When time cycles free up I’ll look into that.

  43. Gavin Says:

    Throw me an account, this looks awesome. Was thinking of purpose-building an app to do same, but running with Google is even better. Any chance of getting a copy of this to run on my own server/alter?

  44. lory Says:

    I use delicius with my blog every day.

  45. davidrothman.net » Blog Archive » Social Search for Health Librarians Says:

    [...] own collaboratively-created Google CSE or Swicki of favorite, subject-specific sites (or have a CSE generated from a del.icio.us account’s links). Librarians should seek to be familiar with technologies for finding and organizing online [...]

  46. Tawia Says:

    is this different with the google search for your domain thing that they offered long time back

  47. Sundaize Blog Says:

    Sundaize Search Engine

    Sundaize Search Engine

  48. valeriana Says:

    I use delicious evary day for my bookmarks great tips
    Thanks

  49. Danny Zacharias Says:

    Any news on the ma.gnolia integration?

  50. Danny Zacharias Says:

    Zooie, any update on the ma.gnolia integration?

  51. Danny Zacharias Says:

    oops, sorry for the double post :(

  52. zooie Says:

    Hey Danny - Sorry for the delay. Haven’t had a chance to get to it. You might want to look at the Google Custom Search site. They have a new feature called ‘Linked CSE’ - I think this might do what you want.

  53. Danny Zacharias Says:

    I looked at the linked CSE, your plugin would suite my needs much better. I’ll keep watching this space - crossing my fingers for ma.gnolia integration :-)

  54. Matt Foster Says:

    That is a very slick implementation. I have been looking into the custom search engine and more specifically their Ajax API. I created a javascript class using the prototype.js library to allow for a completely customizable ajax search. They now support binding an GWebSearch Object to a specific CSE.

    http://positionabsolute.net/blog/2007/08/implement-custom-search.php

    Cheers,
    Matt

  55. EB Says:

    I can’t get your page to load! Dying to see what you’ve done here, but it keeps telling me the connection timed out. Any hints?

  56. zooie Says:

    Hi EB - Sorry, I was running it on my friend’s server and I think I may have overstayed my welcome ;)

    Let me see what I can do.

    – Vik

  57. David Westbrook Says:

    Any luck relocating this service? I’m eager to try it!

  58. zooie Says:

    Hi David - Not yet - Sorry! Anyone out there got a server available running apache/mod_python?

  59. Jeremy Says:

    Zooie, I am interested in helping you out - However, do you have any idea how much cpu usage and bandwidth does your app require?

  60. nemti Says:

    I stumbled across this page while searching for a way to search my Ma.gnolia bookmarks. I wasn’t able to find anything else, so I wrote my own little tool (http://nemti.awardspace.com/goo.gnolia/). It’s just a rather simple implementation of Google linked CSE. Yours sounds much more featureful - I hope you can find a host, and it would be great if you could add Ma.gnolia support.

  61. Kristen Says:

    I tried this, but when I uploaded my XML annotations and skeleton, I got an “error parsing XML at line 3″ message in both cases. Can you tell me what I did wrong? Thanks.

  62. zooie Says:

    Hi Kristen - Good chance Google changed their XML formats since I developed this tool. Could you send me your XML (the one which produces the bug)? vik.singh [at gmail]. Thanks.

  63. L.A. Buddy Says:

    As I see deligoo.com do the same thing with del.icio.us search, but you must install their plugin.

  64. Charlie’s path to dEAth » 用Google自定义搜索引擎(CSE)搜索del.icio.us Says:

    [...] zooie’s work I almost duplicate it besides Linked CSEs introduced. Then you need not to download and upload the annotations XML file and you can copy a piece of code to your page then get the cse toolbar. [...]

  65. Google blog» Архив блога » Here’s some love for our Custom Search friends Says:

    [...] Importing del.icio.us bookmarks for CSE (aka Vik’s tool) [...]

  66. Saved: Tiedon hallintaa sosiaalisten kirjanmerkkien avulla? at Ip’s Says:

    [...] Singh, V. 2007. Google Co-op just got del.icio.us!. Saatavilla www-muodossa: http://zooie.wordpress.com/2007/01/03/google-co-op-just-got-delicious/. [...]

  67. jeroen Says:

    hi,

    we used google cse and delicous to generate results based on all delicious bookmarks and created refinements based on the tags. Check it out at http://www.scoofers.com

  68. SearchFiles Says:

    Thx…

  69. del.icio.us driven Google custom search « The Ancient Geeks Says:

    [...] a CSE that searched over sites I have tagged in del.icio.us. I found a couple of examples of this. One wanted my delicious username and password. No thanks. The other, deligoo, looks good, but wanted me [...]
