Cellular Consequences of Genetic variation: 04/01/2006

Tuesday, April 25, 2006

Engineering a scientific culture

In a commentary in Cell, Gerald Rubin describes Janelia Farm, the new research campus of the Howard Hughes Medical Institute. If you cannot access the commentary, there is a lot of information available on the website such as this flash presentation (oozing with PR talk).

In summary (as I understood it), the objective is to create a collaborative working environment where scientists can explore risky and long-term projects without having to worry about applying for grants and publishing on a very regular basis.
Group leaders at Janelia Farm will:
- have small groups (two to six)
- not be able to apply for outside funding
- still work at the bench

Unless you are really interested in managing resources and all the hassle of applying for grants, this sounds very appealing.

Also, there is no limit on the amount of time a group leader can stay at Janelia Farm, as long as they pass a review process every 5 years. This is unlike, for example, here at EMBL, where most people are forced to move after 9 years (there is a review process after 5 years).

Since the main objective of Janelia Farm is to work on long-term projects that can have significant impact, the review process will not focus on publications but on more subjective criteria like:
"(1) the ability to define and the willingness to tackle difficult and important problems; (2) originality, creativity, and diligence in the pursuit of solutions to those problems; and (3) contributions to the overall intellectual life of the campus by offering constructive criticism, mentoring, technical advice, and in some cases, collaborations with their colleagues and visiting scientists"

Sounds like a researcher's paradise :) Do the science and we will do the rest for you.
It will be interesting to see in a few years whether they manage to create such an environment. The lack of objective criteria and of any limit on the length of stay at the campus might lead to some corruption.

Friday, April 21, 2006

Posting data on your blog

Via Postgenomic I came across this blog post on science blogs in Science and Politics. In it, Bora Zivkovic describes the different types of science blogging, with several examples. The most interesting part for me was his discussion of posting hypotheses and unpublished data. I was very happy to see that he already had some posts with his own unpublished data and that the discussion about science communication online is coming up in different communities.

His answer to the scoop problem:
But, putting data on a blog is a fast way of getting the data out with a date/time stamp on it. It is a way to scoop the competition. Once the data are published in a real Journal, you can refer back to your blog post and, by doing that, establish your primacy.

There are some problems with this. For example, people hosting their own blogs could forge the dates, so it would be best to have a third party time-stamp the data. Postgenomic would be great for this: there could be another section in the aggregator to track posts with data. Also, some journals will probably complain about prior publication and decline to publish something already seen on a blog.
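The third-party time-stamping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TimestampRegistry` class is hypothetical (neither Postgenomic nor any other aggregator is known to offer such a service), and it simply records the first time it saw the SHA-256 digest of a post. Because the registry, not the blogger, assigns the time, editing the date on the blog afterwards would not change the record.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(post_text):
    """Return the SHA-256 digest of a post; any edit to the text changes it."""
    return hashlib.sha256(post_text.encode("utf-8")).hexdigest()


class TimestampRegistry:
    """Hypothetical third-party registry mapping digest -> first time seen."""

    def __init__(self):
        self._first_seen = {}

    def register(self, digest, now=None):
        # The registry chooses the time; an earlier record is never overwritten.
        if now is None:
            now = datetime.now(timezone.utc)
        return self._first_seen.setdefault(digest, now)

    def lookup(self, post_text):
        # Returns the recorded time, or None if this exact text was never registered.
        return self._first_seen.get(fingerprint(post_text))


registry = TimestampRegistry()
post = "Unpublished data: made-up example text for illustration."
stamp = registry.register(fingerprint(post))
```

Later, `registry.lookup(post)` returns the recorded time for the unmodified text and `None` for any altered version, which is what would let an author point back to the post to establish primacy.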

The problems with the current publishing system and the agonizing feeling of seeing your hard work published by other people will probably help drive some change in science communication. Blogging data would make science communication more real-time and transparent, hopefully reducing the wasted resources and frustration that come with overlapping research.

This is a topic I come back to once in a while, so I have mentioned it here before. The stream-like format of the blog makes it hard to keep posting all the relevant links on the topic, so from now on I will just link to the last post on the topic to at least form a connected chain.

Tuesday, April 11, 2006

Stable scientific databases

The explosion of scientific data coming from high-throughput experimental methods has led to the creation of several new databases for biological information (protein structures, genomes, metabolic networks and kinetic rates, expression data, protein interactions, etc). Given that funding is generally awarded for a limited time and for defined projects, it is possible to obtain money to start a database project, but it is very difficult to obtain a stable source of funding to sustain a useful database. I have mentioned this more than once before when talking about the funding problems of BIND.
In this issue of The Scientist there is a short white paper entitled "Save our Data!". It details the recommendations of The Plant Genome Database Working Group for the problems currently faced by life science databases.

I emphasize here four of the points they make:
2. Develop a funding mechanism that would support biological databases for longer cycle times than under current mechanisms.
3. Foster curation as a career path.
6. Separate the technical infrastructure from the human infrastructure. Many automated computational tasks do not require specialized species- or clade-specific knowledge.
7. Standardize data formats and user interfaces.

The first and last points were also discussed in a recent editorial in Nature Biotech.

What was a bit of a surprise for me is their 3rd point on fostering curation as a career path. Is it really necessary to have professional curators? I am a bit divided between a more conservative approach to data curation, with a team of professional curators, and a wisdom-of-the-crowds type of approach where tools are given to the communities and they solve the curation problems. I think it would be more efficient to find ways to have the people producing the data curate it into the databases automatically. For this to happen it has to be really easy and immediate to do. I still think that journals are the only ones capable of enforcing this process.

The 6th point they make is surely important even if the curation effort is to be pushed back to the people producing the data. It is important to make the process of curating the data as automatic and easy as possible.

Friday, April 07, 2006

Retracted scientific work still gets cited

Science has a news focus on scientific misconduct. One study tracked citations of papers that had already been retracted, and found that scientists keep citing retracted papers.
Some editors contacted by Science said that they do not have the resources to look up every citation in every paper to help purge the literature of citations to retracted work. In my opinion this is not such a complicated problem. If journals agreed to submit all retractions to a central repository, then citations could very easily be checked against the database and removed. Even with such an automatic system, scientists should have the responsibility to keep up with the work being retracted in their fields.
Since retractions are publicly announced by the journals, PubMed already has some of this information available. If you search PubMed for "retraction" in the title you can see several of these announcements (not all hits are retractions). In some cases, when you search for the title of a retracted paper, PubMed shows a link to the retraction, but this is not always the case. All that is needed is for publishing houses to agree on a single format for publishing retractions and for repositories to make sure all retractions are appended to the existing entries for the same publication.
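To show how little machinery the automatic check would need, here is a minimal sketch in Python. It assumes that a central repository exists and can hand over a plain list of identifiers for retracted papers; no such shared repository or format is described in the Science piece, and the identifiers below are made up for illustration.

```python
def normalize(identifier):
    """Make identifiers comparable regardless of case or stray whitespace."""
    return identifier.strip().lower()


def find_retracted_citations(reference_ids, retracted_ids):
    """Return the references of a manuscript that appear in the retraction list.

    reference_ids: identifiers cited by the manuscript (hypothetical format).
    retracted_ids: identifiers from the assumed central retraction repository.
    """
    retracted = {normalize(r) for r in retracted_ids}
    return [ref for ref in reference_ids if normalize(ref) in retracted]


# Made-up identifiers, purely for illustration.
retraction_list = ["paper-0007", "paper-0042"]
manuscript_refs = ["paper-0001", "PAPER-0042 ", "paper-0100"]

flagged = find_retracted_citations(manuscript_refs, retraction_list)
# flagged == ["PAPER-0042 "]
```

A journal could run something like this at submission time and ask the authors to justify or remove each flagged reference. The hard part is agreeing on the shared list and identifier format, not the lookup itself.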

Tuesday, April 04, 2006

Viral marketing gone wrong

The social internet has emerged as an ideal ground for marketing. People enjoy spreading news, and on the internet the spread of a meme sometimes resembles a viral infection propagating through the network.
Some companies like Google have built their success on this type of word-of-mouth marketing. If you can get a good fraction of the social internet so attached to your products that they want to tell their friends all about them, you don't have to spend money on marketing campaigns.
The important point here is that a fraction of people must be engaged in the meme: they must find it so cool and interesting that they just have to go and tell their friends and infect them with the enthusiasm. How do you do this? That's the hard part, I guess.
So, the marketing geniuses of Chevrolet decided that they would try their hand at viral marketing. To get people engaged they decided to have the masses build the ads. We usually like what we build and we want to show it to our friends, so the idea actually does not sound so bad, right?! :) Well, this would have been a fantastic marketing idea if most people actually had good things to say about the product.

Here is an example of the videos coming out from the campaign:

I worried before that this type of marketing could be a negative consequence of science communication online, but these examples show that directing attention alone is not enough: people will judge what they find and are free to criticize.

Monday, April 03, 2006

The Human interactome project

Marc Vidal has a letter in The Scientist urging scientists and funding agencies to increase efforts to map all human protein interactions. He suggests that different labs work on different parts of the huge search space (around 22000^2 excluding splice variants) and, of course, that funding agencies give out more money to support the effort. He makes an interesting point when he compares funding for genome projects with interactome mapping. I also think that interactome mapping should be viewed in the same way as genome sequencing and that the money invested would certainly result in significant progress in basic and medical research.
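To get a feeling for the size of that search space, here is a small back-of-the-envelope sketch in Python. The figure of 22000 proteins comes from the text above; the number of labs (10) and the idea of giving each lab a contiguous block of "bait" proteins to test against all others are my own assumptions for illustration, not something proposed in the letter.

```python
N_PROTEINS = 22_000  # approximate number from the text, excluding splice variants

# Every bait tested against every prey: the full 22000 x 22000 matrix.
ordered_pairs = N_PROTEINS ** 2  # 484,000,000

# If A-B and B-A count as the same test, keeping self-interactions (A-A).
unordered_pairs = N_PROTEINS * (N_PROTEINS + 1) // 2  # 242,011,000


def bait_blocks(n_proteins, n_labs):
    """Split bait indices 0..n_proteins-1 into n_labs near-equal contiguous blocks."""
    base, extra = divmod(n_proteins, n_labs)
    blocks = []
    start = 0
    for lab in range(n_labs):
        # The first `extra` labs take one additional bait each.
        size = base + (1 if lab < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


# Hypothetical split among 10 labs: each block of baits is tested against all proteins.
blocks = bait_blocks(N_PROTEINS, 10)
tests_per_lab = [len(block) * N_PROTEINS for block in blocks]
```

With 10 labs, each one would still face 2200 x 22000 = 48,400,000 pairwise tests, which gives an idea of why this needs genome-project levels of funding.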
The only thing I would add to my own wish list is that some groups would start comparative projects at the same time. Even if it takes longer to complete the human interactome, it would be much more informative to have a map of the orthologous proteins in a sufficiently close species to compare with (like mouse). Alternatively, some funding could go specifically to comparative projects studying, for example, the interactomes of different yeasts (it is easy to guess that I would really, really like to have this data for analysis :).