1. Grid resource discovery.
Relatively new. Can borrow ideas from P2P. The important thing is to understand the special needs of the Grid and the characteristics of Grid resources: how are they represented, and what are the candidate resources? May refer to GLOBUS, or to recent papers on Grid resource discovery. Identify its special characteristics, then treat Grid resource discovery as a special case of P2P applications.
2. P2P resource discovery/content location/information retrieval/search/routing
Review the current directions to see where there could be an insertion point.
1. Single keyword search:
- Unstructured single keyword search: how to control flooding. Advanced Gnutella: ultrapeers, Bloom filter summaries, dynamic querying (instead of flooding, send a probing query, then adjust the TTL and the number of connections used for further sending). Last-hop routing: routing tables are exchanged only between directly connected neighbors. Alternatively, use DHT broadcasting to control flooding, which guarantees that each node is visited only once.
- DHT-based single keyword search
- One-hop routing: every node has a global index. Use hierarchy + DHT.
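The Bloom filter summary idea can be sketched roughly as follows. This is only an illustration under assumptions, not Gnutella's actual wire format: the class `BloomFilter`, the helper `leaves_to_query`, the filter size, and the leaf names are all hypothetical. Each leaf summarizes its keywords in a filter, and the ultrapeer forwards a query only to leaves whose summary may match.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: false positives possible, false negatives not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, keyword):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{keyword}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, keyword):
        for pos in self._positions(keyword):
            self.bits |= 1 << pos

    def might_contain(self, keyword):
        return all((self.bits >> pos) & 1 for pos in self._positions(keyword))


def leaves_to_query(leaf_summaries, keyword):
    """Return only the leaves whose summary may match, instead of flooding all."""
    return [leaf for leaf, bf in leaf_summaries.items() if bf.might_contain(keyword)]


# Hypothetical ultrapeer holding summaries for two leaves.
summaries = {"leaf_a": BloomFilter(), "leaf_b": BloomFilter()}
summaries["leaf_a"].add("grid")
summaries["leaf_b"].add("gnutella")
targets = leaves_to_query(summaries, "grid")
```

A leaf that really holds the keyword is always selected; a leaf that does not may occasionally be selected too, which costs one wasted message but never a missed result.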
2. Multi-keyword searching: information retrieval (TF-IDF, vector distance)
Semantic overlay:
(1) Existing: inverted list + flooding.
(2) Semantic distance: use a DHT (CAN in pSearch) for storage, with each document's semantic vector as the key. A query is directed to the nodes whose semantic distance is closest to the query.
(3) Store every single keyword with a DHT.
    (a) Every keyword in the query is matched and the results are intersected (k times the cost of a normal DHT lookup).
    (b) Nodes store not only the keyword but also the other keywords of the document (eSearch).
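Approach (3)(a) can be sketched as below. This is a minimal sketch, assuming a toy in-memory stand-in for a DHT: `ToyDHT`, `multi_keyword_search`, the node count, and the document ids are hypothetical names for illustration, and real DHT routing is replaced by a simple hash-modulo placement. It shows why a k-keyword query costs k lookups.

```python
import hashlib


class ToyDHT:
    """Stand-in for a DHT: each keyword hashes to one of num_nodes buckets."""

    def __init__(self, num_nodes=8):
        self.nodes = [dict() for _ in range(num_nodes)]
        self.lookups = 0  # counts how many DHT lookups a search needs

    def _node_for(self, key):
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def publish(self, doc_id, keywords):
        # Index the document under every one of its keywords.
        for kw in keywords:
            self._node_for(kw).setdefault(kw, set()).add(doc_id)

    def lookup(self, keyword):
        self.lookups += 1
        return self._node_for(keyword).get(keyword, set())


def multi_keyword_search(dht, query_keywords):
    """One lookup per keyword, then intersect the posting sets."""
    postings = [dht.lookup(kw) for kw in query_keywords]
    if not postings:
        return set()
    return set.intersection(*postings)


dht = ToyDHT()
dht.publish("doc1", ["grid", "resource", "discovery"])
dht.publish("doc2", ["p2p", "resource", "search"])
result = multi_keyword_search(dht, ["resource", "discovery"])
# result is {"doc1"} and dht.lookups is 2 (one lookup per query keyword)
```

Approach (3)(b) would instead store each document's full keyword list next to every posting, so the node responsible for the first keyword could filter locally without further lookups, at the price of more storage.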
3. Semantic search: resource metadata in XML/RDF
Decompose the XML/RDF into multiple attributes, then search in the same way as multi-keyword search. Or use more semantic factors: ontology, etc.
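The decomposition step might look like the sketch below. The metadata document, its element names (`type`, `os`, `memory`), and the helper `decompose` are assumptions made up for illustration, not a real Grid schema. Each child element becomes an "attribute=value" string that could then be indexed like an ordinary keyword.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource description; the element names are illustrative only.
METADATA = """
<resource>
  <type>cpu</type>
  <os>linux</os>
  <memory>2048</memory>
</resource>
"""


def decompose(xml_text):
    """Flatten one level of XML metadata into 'attribute=value' keys."""
    root = ET.fromstring(xml_text.strip())
    return [f"{child.tag}={(child.text or '').strip()}" for child in root]


keys = decompose(METADATA)
# keys is ["type=cpu", "os=linux", "memory=2048"]
```

A query such as "os=linux AND type=cpu" then reduces to a two-keyword search over these keys. Nested elements or RDF triples would need a richer flattening rule than this one-level sketch.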
4. Index-by-keyword and Index-by-document