Sunday, July 23, 2006

Opening up the scientific process

During my stay at the EMBL over the past couple of years, it has happened more than once that people I know have been scooped. This simply means that all the hard work they had been doing was already done by someone else who managed to publish it a bit sooner, which severely limited the usefulness of their discoveries. Very few journals are interested in publishing research that merely confirms other published results.

From talking to other people, I have come to accept that scooping is a part of science. There is no other possible conclusion from this but to accept that the scientific process is very flawed. We should not be wasting resources literally racing each other to be the first person to discover something. When you try to explain to non-scientists that it is very common to have 3 or 4 labs doing exactly the same thing, they usually have a hard time reconciling this with their perception of science as the pursuit of knowledge through collaboration.
I am probably naïve, given that I have only been doing this for a couple of years, but I do not mean to say that we do not need competition in science. We need to keep each other in check precisely because a lack of competition leads to waste of resources. I would argue, however, that right now the scientific process is creating competition at the wrong levels, decreasing potential productivity.

So how do we work and what do we aim to produce? We are in the business of producing manuscripts accepted in peer-reviewed journals. To have competition there must be a scarce element. In our case the limited element is the attention of fellow scientists. Given that scientists' attention is scarce, we all compete for the limited amount of time that researchers have to read papers every week. So the good news is that the system tends to give credit to high-quality manuscripts. This means that, within this system, research projects and ongoing results should be kept absolutely confidential and everything should be focused on getting that Science or Nature paper.
I found a beautiful drawing of an iceberg (used here with permission from the author, David Fierstein) that I think illustrates the problem we have today by focusing the competition on the manuscripts. Only a small fraction of the research process is in view.


Wouldn’t it be great if we could find a way to make most of the scientific process public but at the same time guarantee some level of competition? What I think we could do is define steps in the process that we could say are independent and that can work as modules. Here I mean module in the sense of a black box with inputs and outputs that we wire together without caring too much about how the internals of the boxes work. I have been thinking about these modules lately, and here is a first draft of what this could look like:


The data streams would be, as the name suggests, a public view of the data being produced by a group or individual researcher. Blogs are a simple way this could be achieved today (see for example this blog). The manuscripts could be built in wikis by selecting relevant data bits from the streams that fit together to answer an interesting question. This is where I propose that the competition would come in. Only those relevant bits of data that best answer the question would be used. The authors of the manuscript would be all those who contributed data bits or in some other way contributed to the manuscript's creation. In this way all the data would be public and a healthy level of competition would still be maintained.
The rest of the process could go on in public view. Versions of the manuscript deemed stable could be deposited in a pre-print server, and comments and peer review would commence. Later there could still be another step of competition to get the paper formally accepted in a journal.
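To make the idea a bit more concrete, here is a minimal sketch in Python of how the data stream and manuscript modules could be wired together. Everything in it is an assumption made for illustration: the names DataBit, Manuscript and build_manuscript, the numeric relevance score, and the rule of keeping the top few bits are hypothetical. How relevance would actually be judged is left open.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataBit:
    """One public entry in a researcher's data stream (hypothetical structure)."""
    contributor: str
    question: str     # the question this piece of data speaks to
    relevance: float  # illustrative score only; the judging method is an assumption


@dataclass
class Manuscript:
    """A manuscript assembled from data bits selected out of the public streams."""
    question: str
    bits: List[DataBit] = field(default_factory=list)

    @property
    def authors(self) -> List[str]:
        # Everyone who contributed a selected data bit is listed as an author.
        return sorted({bit.contributor for bit in self.bits})


def build_manuscript(question: str, streams: List[List[DataBit]], top_n: int = 2) -> Manuscript:
    """Keep the top_n most relevant bits for a question across all streams."""
    candidates = [bit for stream in streams for bit in stream if bit.question == question]
    # Competition step: only the bits that best answer the question are used.
    chosen = sorted(candidates, key=lambda b: b.relevance, reverse=True)[:top_n]
    return Manuscript(question=question, bits=chosen)


# Usage with three made-up data streams.
alice = [DataBit("alice", "q1", 0.9), DataBit("alice", "q2", 0.4)]
bob = [DataBit("bob", "q1", 0.7)]
carol = [DataBit("carol", "q1", 0.2)]

paper = build_manuscript("q1", [alice, bob, carol])
print(paper.authors)  # ['alice', 'bob']
```

The point of the sketch is that the manuscript only sees the streams through their public entries, so the selection rule could be swapped for something else without touching how the data is produced.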

One advantage of this is that it is not a revolution of the scientific process. People could still work in their normal research environment closed within their research groups. This is just a model of how we could extend the system to make it mostly open and public. The technologies are all here: structured blogging for the data streams, wikis for the manuscripts and online communities to drive the research agendas.

I think it is important to view the scientific process as a group of modules also because it allows us later to think of different ways to wire the modules together. Increasing the modularity should permit us to innovate. For example, we can later think of ways that the data streams are brought together to answer questions, etc.


7 comments:

Jean-Claude Bradley said...

Pedro - thanks for using our UsefulChem experiments blog as an example of the modular approach that you discuss. You actually bring up a number of points that are important for the future of science and how it actually gets done. As you point out, as scientists we have to ask ourselves what business we are in. Are we in the business of producing manuscripts or increasing knowledge? I am still of the opinion that it is possible for scientists to operate in full transparency, as we are trying to do in our malaria research. I don't think we need to keep black boxes. The key point is that everyone's contribution is made available. This is one reason that we have moved our experiments in progress from a blog to a wiki - so that everyone's contribution can be tracked in the history. Using Wikispaces to do this also gives us a third-party time stamp, which may come in handy if disputes arise.

Neil said...

You crazy idealistic dreamer :)

I suppose that advocates of the current system would argue that it represents some sort of Darwinian struggle. Those who win the publication race are the "winners" and, in theory, produce the best science. Of course we all know that this is not true. What we're seeing is survival of the biggest and best-funded. A lab with 10 PIs on a problem should in theory always "beat" one solitary guy.

Unfortunately, competition arises through career pressure. To answer Jean-Claude: yes, a lot of scientists believe that they are in the business of producing manuscripts, because when it comes to getting tenure or some permanent job, number of manuscripts is all that counts. Established researchers perpetuate this, as they advise younger researchers that this is the only path to success. They care little for your contributions to online communities.

The great thing about your model is that it doesn't preclude anyone from publishing as much as they want/need to and science as a whole benefits. I really think that these are exciting times and that the way research is done and published really will change radically in the next few years. It's great that young researchers are coming through and pushing this agenda. It's going to be a long, hard struggle against the academic establishment.

Rosie Redfield said...

I guess I'm a crazy idealistic dreamer too.

I run a research lab. We're not plagued by competitors (I actually wish a few people would work on our problem), and I've already taken some steps towards making our research more open, by putting our research proposals on line as soon as they're submitted, and by using open-access journals whenever possible.

But this post gave me the idea of using a blog to describe the ongoing process of doing and thinking about the research we do. I'd use it to describe/explain (mainly to myself) the scientific issues I'm thinking about: what experiments we've done, what the results were if they worked (or the possible explanations for why they didn't work), what I think the results mean for the questions we're trying to answer.

So I started a blog; it's up at http://rrresearch.blogspot.com. This is my first try at blogging, so it's not very slick yet. But I'm already thinking of encouraging the members of my lab to (1) read my blog regularly, and (2) keep their own blogs where they write about the progress of their research.

Thanks.

Anonymous said...

While the current system has its faults, I am not convinced complete openness is the answer. Sadly there are too many selfish people in research (one of the reasons the current system fails), and results and ideas would probably be stolen by big labs with resources and connections. There is also the issue of funding - will one name in the middle of a big paper get you the funding you need? Also, what if someone uses your data to draw conclusions you're not happy with? Yes, we need a solution, but is this it?

Francois Rivest said...

On the point that the first to publish wins and the second doesn't get published, I think something should be added. Good science should not only be reproducible, but be reproduced for confirmation. Whether you are first or second, both, if of quality, should be published. It is sad if publishing does not always work that way.

Pedro Beltrão said...

Thank you all for your comments and criticisms.
Anonymous: I am not sure that this is a good model; I am mostly trying to propose that the current competition is misplaced and leads to waste of resources. If we agree on this then it is in our interest to think of alternatives. We are trained to think of solutions to problems, so maybe we can find one for this problem as well.

There will always be a percentage of people who will try to game the system, but today I can get scooped by someone who does not even know that I am working on the same thing. There are just so many of us scientists reading the same things that it is obvious that we end up having the same ideas. In a more open model, and given that I could show when and how I came up with a given result, maybe it would be easy to spot when someone is stealing results.

As to funding, if there is really a large number of people working on a problem then maybe there will be more than one paper to publish. Not like today, when sometimes I read a paper in my field, look at the list of authors and think to myself: what did all these people do?

Finally, if my data proves a point that I am not happy with, tough. We are discovering nature, not my own opinions about it.

Pedro Beltrão said...

A friend told me that my comment sounded a bit aggressive. I guess I should have put a smiley face in there somewhere :). I was trying to be logical and I guess it came out dry.
