Make Your Genes All Point the Same Direction

→ → → → → → → → → → → → → → → → → → → → → → → →
The title of the post is basically the entire lesson here. If you are designing multigene constructs and don’t want to read this whole page, just make all your genes face the same direction, OK?
→ → → → → → → → → → → → → → → → → → → → → → → →

Before we started hiring lots of people at Mozza I made the executive decision that we would be using the MoClo cloning system. Later on we switched to the PhytoBrick standard, which is almost the same, and eventually started making our own bespoke solutions to accelerate certain build chains that were frequently encountered. Anyways, we experimented with letting everyone (lab techs, research associates) design plasmids, I think to give people complete ownership of the project they were working on. The idea was that if individuals were responsible for everything like inventorying their own reagents and scheduling their experiments (and designing and assembling their own plasmids) then any failure to meet deadlines was 100% on them. This system didn’t last long but one thing that did persist longer than it ought to was poor plasmid designs.

One of the features of MoClo is that you end up with lots of reusable, intermediate parts that can be used to build stuff in the future without needing to start from scratch. In MoClo parlance one of these intermediate things is a “Level 1 transcriptional unit.” You may recognize this as a “gene.” Over time we build up a sizable collection of these Level 1 transcriptional units. We simultaneously start building up institutional knowledge about the properties of these genes. Some are company-wide favorites and some are regarded as duds. And inevitably, as with anything biological, we develop some mysticism around some of these genes. We don’t know exactly why this one works well so we are hesitant to tweak it. Any suggestion to change things means abandoning the intermediate work to start from scratch and the possibility of breaking whatever magic our favorite gene has.

Another “feature” of MoClo is that the direction of each gene can be chosen independently. So for a given multigene plasmid the overall design could look like ← → ← → or → ← ← ← etc. For those readers who made it past the first line but are getting antsy, what you should always always do is → → → →.
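If you keep your designs in any kind of machine-readable form, the rule is easy to check automatically. Here is a minimal sketch, assuming a design is written as a space-separated string of arrows like the ones above. The function names are made up for illustration and are not part of MoClo or any real tool.

```python
# Hypothetical helper: a design is a string of arrows such as "← → ← →",
# one arrow per Level 1 transcriptional unit, in plasmid order.
FORWARD, REVERSE = "→", "←"


def parse_design(design: str) -> list[str]:
    """Split a design string into a list of orientation arrows."""
    arrows = design.split()
    for arrow in arrows:
        if arrow not in (FORWARD, REVERSE):
            raise ValueError(f"unexpected orientation symbol: {arrow!r}")
    return arrows


def all_same_direction(design: str) -> bool:
    """Return True when every gene in the design faces the same way."""
    return len(set(parse_design(design))) <= 1
```

With this, `all_same_direction("→ → → →")` is true, while both mixed examples above come back false.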

The scare quotes around the word feature should tell you what I really think about it. I guess to some extent more options is always a feature rather than a bug because no one is forcing you to use those features. It just leads to buggy designs if you don’t know any better.

Narrator’s voice: He didn’t know any better.

Getting back to the institutional inertia - if it was just intermediate plasmids that had already been built and QC’d that was preventing us from making changes, I would have waved my wand and proclaimed that we are going to rebuild some plasmids from scratch. With our build process that would only be a week setback at most. But there is an even stronger inertia that we accumulated. We have some experiments that, in part, rely on qPCR to tell us the relative expression level of a handful of genes, partially so that we can correct for differences in transformation efficiency in the math at the end. With qPCR you always need to pick one sample, design, or individual as the reference that everything is compared to. So we naturally begin to have one favorite plasmid that we always compare everything to. We have so much knowledge built up about how that one behaves in different circumstances that if we change reference plasmids, a chasm would form between the “before experiments” and the “after experiments”. All the experiments on one side of the divide are compared to each other but it becomes difficult to think about comparing experiments spanning the divide since we introduced new variables.
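To make the dependence on the reference concrete, here is a minimal sketch of one common way to do that math, the 2^-ΔΔCt calculation. This is an assumption for illustration: it is not necessarily the exact analysis we ran, it assumes roughly 100% amplification efficiency, and the "normalizer" is a hypothetical transcript used to correct for how much DNA actually got into the cells.

```python
def relative_expression(
    ct_target_sample: float,
    ct_norm_sample: float,
    ct_target_ref: float,
    ct_norm_ref: float,
) -> float:
    """Fold change of a target gene in a sample versus the reference plasmid.

    Assumes the 2^-ddCt method with ~100% amplification efficiency.
    All argument names are illustrative, not taken from a real pipeline.
    """
    # Normalize the target to the normalizer within each sample.
    delta_sample = ct_target_sample - ct_norm_sample
    delta_ref = ct_target_ref - ct_norm_ref
    # Compare the test sample to the reference plasmid.
    delta_delta = delta_sample - delta_ref
    return 2.0 ** (-delta_delta)
```

Every number that comes out of this is a ratio against the reference plasmid's Ct values, which is why swapping the reference changes the meaning of every downstream comparison.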

So there is a very good reason to not touch the reference plasmid. But we start to have concerns about the design of this plasmid. We suspect that the reason we can’t detect one of the casein genes in many assays is that we haphazardly oriented two adjacent genes to face each other like → ←. We suspect that if the transcription of the second gene ever runs long it will transcribe part of the first gene in the antisense direction. That antisense RNA could then hybridize with transcripts of the first gene to make double stranded RNA, which then gets processed into siRNAs that suppress the first gene through the RISC pathway. Confirming that mechanism would be a lot of work. Like a lot a lot. So instead we just want to reorient the genes to avoid that. But now we must abandon a lot of perfectly valid and well executed experiments on the other side of the chasm. It is hard to convince yourself that it’s worth it.
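The specific pattern to look for is a forward-facing gene immediately followed by a reverse-facing one. A minimal sketch for flagging those pairs, again assuming a design written as a space-separated string of arrows (the function name is hypothetical):

```python
def convergent_pairs(design: str) -> list[tuple[int, int]]:
    """Return index pairs of adjacent genes that face each other (→ ←).

    Indices are 0-based positions in the design string. Only the
    convergent case described above is flagged; divergent pairs (← →)
    are ignored by this sketch.
    """
    arrows = design.split()
    return [
        (i, i + 1)
        for i in range(len(arrows) - 1)
        if arrows[i] == "→" and arrows[i + 1] == "←"
    ]
```

For example, `convergent_pairs("→ ← ← ←")` flags the first two genes, and an all-forward design returns an empty list.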

Figure: overlapping transcripts may lead to RISC

You can try to make your new reference plasmid in a certain way to allow for comparing it directly to your old reference plasmid, creating a bridge to compare the new experiments back to the old experiments. But there are always caveats with this. In my experience the difficulty in creating this logical bridge grows with like the cube of the number of changes you make to the reference designs.

The moral of the story, if you haven’t picked it up already, is to make your genes all face the same direction. Biology is always complicated and messy. And maybe there is some rare use case for a gene oriented reverse of all the rest. But unless you are very confident you have one of those cases, do yourself a favor and take my advice.

I suspect that this is one of the primary ways in which transgenes get silenced in plants. Your T-DNA lands in a random place. Maybe it’s in a coding sequence or intron or downstream of a coding sequence. In soybean this frequently doesn’t cause any obvious problems because of the amount of gene duplication, as I mentioned in this essay. But the promoter of this interrupted gene keeps chugging along, causing your transgene to be transcribed. And your transgene landed in the opposite orientation to the gene it landed in. So the endogenous transcription generates antisense RNAs from your transgene, which lead to silencing one way or another.