Why are scientific collaborations so difficult to sustain? It is natural to assume that scientists would be really good at collaboration, but attempts to set up computer-based collaborative projects (collaboratories) within the scientific community haven’t been very successful. It seems that while scientists are good collaborators, they function best in localised, face-to-face groups.
Three areas of difficulty are identified:
1. Knowledge (as opposed to mere information) is hard to transmit across distances. It’s much easier for a scientist to explain his ideas, which may be on the cutting edge of understanding, directly to a colleague than to someone over a computer network.
2. Scientists work independently most of the time. They are inclined to work to their own research and travel schedules.
3. They typically work for institutions, and crossing institutional boundaries has traditionally been difficult. Legal issues may need to be resolved, and there is often a lot of protectiveness over intellectual property.
To help resolve these difficulties, the National Science Foundation (NSF) funded The Science of Collaboratories (SOC), a five-year project to study large-scale academic research collaborations across many disciplines. The goals were to compare different collaborative projects, develop theory about this emerging research form, and develop strategies for facilitating more successful projects in the future. The researchers ended up with a seven-category taxonomy of collaboratories. Taxonomy is the science or technique of classification. (Why not just say “categories”? You see, this is why we’ve got point 1 above.)
There follows a very long-winded and utterly riveting account of what defines a collaboratory, the kind of sampling techniques used, and a bit with “prototypicality” in it.
Of the seven categories explained, a few notable ones were:
Shared Instrument: This kind of collaboratory is set up to give researchers access to expensive or normally inaccessible equipment. The example given is a pair of twin telescopes in Hawaii which, because of their remote location, can be accessed remotely from several subscribed universities. This kind of observation produces a very large amount of data, all of which needs to be dealt with.
Community Data System: An information resource that is created, maintained, or improved by a geographically distributed community. The example given is the Protein Data Bank (PDB), which processes and distributes 3D structure data of proteins and molecules. Interestingly, this project and ones like it often lead to great advances in 3D modelling and data visualisation techniques to deal with the large datasets produced.
Open Community Contribution System: This is a group of often geographically separated people who unite to work on a specific research problem. The interesting thing is that it often involves members of the general public, and encourages them to contribute work to a project, not necessarily data. Wikipedia is given as an example, but it also reminds me of the usefulness of amateur astronomers, who by sheer strength of numbers can monitor large portions of the sky that professionals can’t, and have made many important discoveries.