Standard Modular Cloning System

System Definition

Definition

Given a genetic alphabet \(\langle \Sigma, \sim \rangle\), a Modular Cloning System \(S\) is defined as a mathematical sequence

\[(M_l,\ V_l,\ e_l)_ {\ l\ \ge -1}\]

where:

  • \(M_l \subseteq \Sigma^\star \cup \Sigma^{(c)}\) is the set of modules of level \(l\)

  • \(V_l \subseteq \Sigma^{(c)}\) is the set of vectors of level \(l\)

  • \(e_l \subseteq E\) is the finite, non-empty set of asymmetric, Type IIS restriction enzymes of level \(l\)

Definition: \(k\)-cyclicity

A Modular Cloning System \((M_l, V_l, e_l)_ {l \ge -1}\) is said to be \(k\)-cyclic after a level \(\lambda\) if:

\[\begin{split}\begin{array}{ll} \exists k \in \mathbb{N}^\star, & \\ \forall l \ge \lambda, & \\ & \begin{cases} M_{l+k} \subseteq M_l \\ V_{l+k} \subseteq V_l \\ e_{l+k} \subseteq e_l \end{cases} \end{array}\end{split}\]

Definition: \(\lambda\)-limit

A Modular Cloning System \((M_l, V_l, e_l)_ {l \ge -1}\) is said to be \(\lambda\)-limited if:

\[\forall l \ge \lambda, M_l = \emptyset, V_l = \emptyset, e_l = \emptyset\]

Modules

Definition

For a given level \(l\), \(M_l\) is defined as the set of modules \(m \in \Sigma^\star \cup \Sigma^{(c)}\) for which:

\[\begin{split}\begin{array}{l} \exists ! (S, n, k) \in e_l, \\ \exists ! (S^\prime, n^\prime, k^\prime) \in e_l, \\ \exists ! (s, s^\prime) \in S \times S^\prime, \\ \exists ! (x, y, o_5, o_3) \in (\Sigma^\star)^4, \\ \\ \quad \exists ! t \in \Sigma^\star, \left\{ \begin{array}{lll} \exists ! b \in \Sigma^\star,\ & m = (s \cdot x \cdot o_5 \cdot t \cdot o_3 \cdot y \cdot \widetilde{s^\prime} \cdot b)^{(c)}, & \text{ if } m \in \Sigma^{(c)}\\ \exists ! (u, v) \in (\Sigma^\star)^2, & m = u \cdot s \cdot x \cdot o_5 \cdot t \cdot o_3 \cdot y \cdot \widetilde{s^\prime} \cdot v, & \text{ if } m \not \in \Sigma^{(c)} \end{array} \right. \end{array}\end{split}\]

with:

  • \(|x| = n\)

  • \(|y| = n^\prime\)

  • \(|o_5| = abs(k)\)

  • \(|o_3| = abs(k^\prime)\)

Note

This decomposition is called the canonic module decomposition, where:

  • \(t\) is the target sequence of the module \(m\)

  • \(b\) is the backbone of the module \(m\) (if \(m\) is circular)

  • \(u\) and \(v\) are called the prefix and suffix of the module \(m\) (if \(m\) is not circular)

  • \(o_5\) and \(o_3\) are the upstream and downstream overhangs respectively.
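The canonic decomposition of a linear module can be sketched in Python. This is only an illustration under stated assumptions: a single BsaI-like enzyme with one recognition site `GGTCTC`, `n = 1` and `abs(k) = 4`, and the tilde read as the reverse complement. The names `revcomp` and `decompose_linear` are hypothetical and not part of any library.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA word (assumed meaning of the tilde)."""
    return seq.translate(COMPLEMENT)[::-1]


def decompose_linear(m, site="GGTCTC", n=1, k=4):
    """Return the canonic decomposition of a linear module, or None.

    Assumes one enzyme (S, n, k) with a single site, used on both sides.
    """
    rc = revcomp(site)
    # Uniqueness: exactly one forward site and one reverse-complemented site.
    if m.count(site) != 1 or m.count(rc) != 1:
        return None
    i, j = m.index(site), m.index(rc)
    x_start = i + len(site)      # x starts right after the site s
    t_start = x_start + n + k    # t starts after x and o5
    t_end = j - n - k            # t ends before o3 and y
    if t_end < t_start:
        return None
    return {
        "u": m[:i],
        "s": site,
        "x": m[x_start:x_start + n],
        "o5": m[x_start + n:t_start],
        "t": m[t_start:t_end],
        "o3": m[t_end:t_end + k],
        "y": m[t_end + k:j],
        "v": m[j + len(rc):],
    }


example = "AA" + "GGTCTC" + "A" + "AATG" + "CCCCCC" + "GCTT" + "T" + "GAGACC" + "TT"
parts = decompose_linear(example)
print(parts["o5"], parts["t"], parts["o3"])
```

A sequence lacking either site, or carrying more than one copy of it, yields `None`, which mirrors the uniqueness quantifiers of the definition.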

Property

\(\forall \langle \Sigma, \sim \rangle\), \(\forall l \ge -1\), \(\forall e_l \subset E\):

\[M_l \text{ is a rational language }\]

Demonstration

Let there be a genetic alphabet \(\langle \Sigma, \sim \rangle\) and a Modular Cloning System \((M_l, V_l, e_l)_ {l \ge -1}\) over it.

\(\forall l \ge -1\), the regular expression:

\[\begin{split}\begin{array}{l} \bigcup_{\begin{array}{l}(S, n, k) \in e_l \\ (S^\prime, n^\prime, k^\prime) \in e_l\end{array}} \Sigma^\star \cdot S \cdot \Sigma^n \cdot \Sigma^{abs(k)} \cdot \Sigma^\star \cdot \overline{(S | S^\prime)} \cdot \Sigma^\star \cdot \Sigma^{abs(k^\prime)} \cdot \Sigma^{n^\prime} \cdot \widetilde{\,S^\prime\,} \cdot \Sigma^\star \\ \end{array}\end{split}\]

matches a sequence \(m \in \Sigma^\star \cup \Sigma^{(c)}\) if and only if \(m \in M_l\).

\(M_l\) is regular, so given Kleene’s Theorem, \(M_l\) is rational.
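As a rough illustration, the expression can be approximated with Python's `re` module for one enzyme. The values below (site `GGTCTC`, its reverse complement `GAGACC`, `n = 1`, `abs(k) = 4`) are assumptions chosen to resemble BsaI, and the complement term is approximated by a negative lookahead that forbids a recognition site from starting inside the target.

```python
import re

# Hypothetical single-enzyme parameters (BsaI-like), not taken from the text.
SITE, SITE_RC, N, K = "GGTCTC", "GAGACC", 1, 4

MODULE_RE = re.compile(
    rf"{SITE}.{{{N}}}(.{{{K}}})"            # s, x, then o5 (group 1)
    rf"((?:(?!{SITE}|{SITE_RC}).)*?)"       # t (group 2), no site starting inside
    rf"(.{{{K}}}).{{{N}}}{SITE_RC}"         # o3 (group 3), y, reverse site
)

example = "AA" + "GGTCTC" + "A" + "AATG" + "CCCCCC" + "GCTT" + "T" + "GAGACC" + "TT"
match = MODULE_RE.search(example)
print(match.groups() if match else None)
```

Using `search` rather than `fullmatch` plays the role of the leading and trailing \(\Sigma^\star\) for a linear sequence. Circular sequences would need to be linearised first, which this sketch does not attempt.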

Vectors

Definition

For a given level \(l\), \(V_l\) is defined as the set of vectors \(v \in \Sigma^{(c)}\) for which:

\[\begin{split}\begin{array}{l} \exists ! (S, n, k) \in e_l, \\ \exists ! (S^\prime, n^\prime, k^\prime) \in e_l, \\ \exists ! (s, s^\prime) \in S \times S^\prime, \\ \exists ! (x, y, o_5, o_3) \in (\Sigma^\star)^4, \\ \\ \quad \exists ! (b, p) \in (\Sigma^\star)^2,\ v = (o_3 \cdot b \cdot o_5 \cdot y \cdot \widetilde{s} \cdot p \cdot s^\prime \cdot x)^{(c)} \\ \end{array}\end{split}\]

with:

  • \(|x| = n\)

  • \(|y| = n^\prime\)

  • \(|o_5| = abs(k)\)

  • \(|o_3| = abs(k^\prime)\)

  • \(o_3 \ne o_5\)

Note

This decomposition is called the canonic vector decomposition, where:

  • \(p\) is the placeholder sequence of the vector \(v\)

  • \(b\) is the backbone of the vector \(v\)

  • \(o_3\) and \(o_5\) are the upstream and downstream overhangs respectively.

Overhangs

By definition, every valid level \(l\) module or vector has exactly one canonic decomposition, and therefore unique \(o_5\) and \(o_3\) overhangs. As such, let \(up\) (resp. \(down\)) be the function which:

  • to a module \(m\) associates the word \(o_5\) (resp. \(o_3\)) from its canonic module decomposition

  • to a vector \(v\) associates the word \(o_3\) (resp. \(o_5\)) from its canonic vector decomposition.
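The swap between modules and vectors is easy to get wrong, so here is a minimal Python sketch of \(up\) and \(down\). The `Module` and `Vector` classes are hypothetical containers for an already computed canonic decomposition; they are not an existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    o5: str      # overhang before the target sequence
    target: str
    o3: str      # overhang after the target sequence


@dataclass(frozen=True)
class Vector:
    o3: str      # overhang before the backbone
    backbone: str
    o5: str      # overhang after the backbone


def up(part):
    """Upstream overhang: o5 for a module, o3 for a vector."""
    return part.o5 if isinstance(part, Module) else part.o3


def down(part):
    """Downstream overhang: o3 for a module, o5 for a vector."""
    return part.o3 if isinstance(part, Module) else part.o5


m = Module(o5="AATG", target="CCC", o3="GCTT")
v = Vector(o3="GCTT", backbone="TTTT", o5="AATG")
print(down(v) == up(m), up(v) == down(m))
```

In this example the module fits the vector on both sides, since its upstream overhang equals the vector's downstream overhang and vice versa.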

Standard Assembly

Definition: Standard MoClo Assembly

Given an assembly of level \(l\), where \((m_1, \dots, m_k) \in M_l^k\) and \(v \in V_l\):

\[a:\quad m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A \subset (\Sigma^\star \cup \Sigma^{(c)})\]

and the partial order \(\le\) over \(S = \{m_1, \dots, m_k\}\) defined as:

\[\begin{split}\begin{array}{l} \forall x, y \in S^2, \\ \quad x \le y \iff \begin{cases} x = y & \\ down(x) = up(y) & \text{ if } x \ne y\\ \exists z \in S \backslash \{x, y\}, down(x) = up(z), \ z \le y & \text{ if } x \ne y \text{ and } down(x) \ne up(y) \end{cases} \end{array}\end{split}\]

then a chain \(\langle S^\prime, \le \rangle \subset \langle S, \le \rangle\) is an insert if:

\[\begin{split}\begin{cases} v \le min(S^\prime) \\ max(S^\prime) \le v \end{cases} \iff \begin{cases} down(v) = up(min(S^\prime)) \\ up(v) = down(max(S^\prime)) \end{cases}\end{split}\]

\(a\) is:

  • invalid if \(\langle S, \le \rangle\) is an antichain or \(\langle S, \le \rangle\) has no insert.

  • valid if \(\langle S, \le \rangle\) has at least one insert.

  • ambiguous if \(\langle S, \le \rangle\) has more than one insert.

  • unambiguous if \(\langle S, \le \rangle\) has exactly one insert.

  • complete if \(\langle S, \le \rangle\) is an insert.
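The classification above can be sketched in Python by enumerating inserts. The sketch reads an insert as a path of distinct modules linked by matching overhangs, starting at \(down(v)\) and ending at \(up(v)\), which is an interpretation of the definition rather than a literal transcription. Parts are reduced to hypothetical `(up, down)` overhang pairs, and `find_inserts` and `classify` are illustrative names.

```python
def find_inserts(modules, vector):
    """List inserts as tuples of module indices.

    modules: list of (up, down) overhang pairs, one per module.
    vector:  (up, down) overhang pair of the vector.
    """
    v_up, v_down = vector
    inserts = []

    def extend(path):
        last_down = modules[path[-1]][1]
        if last_down == v_up:
            inserts.append(tuple(path))  # the path closes onto the vector
        for j, (u, _) in enumerate(modules):
            if j not in path and u == last_down:
                extend(path + [j])

    for i, (u, _) in enumerate(modules):
        if u == v_down:  # the first module must follow the vector
            extend([i])
    return inserts


def classify(modules, vector):
    """Return the set of labels that apply to the assembly."""
    inserts = find_inserts(modules, vector)
    if not inserts:
        return {"invalid"}
    labels = {"valid", "unambiguous" if len(inserts) == 1 else "ambiguous"}
    if any(len(path) == len(modules) for path in inserts):
        labels.add("complete")
    return labels


vector = ("CGCT", "AATG")  # (up(v), down(v))
print(classify([("AATG", "GCTT"), ("GCTT", "CGCT")], vector))
print(classify([("AATG", "CGCT"), ("AATG", "CGCT")], vector))
```

The first call describes two modules that chain into a single insert using every module. The second describes two interchangeable modules, each forming an insert on its own. The enumeration is exponential in the worst case, which is acceptable for the handful of parts in a typical assembly.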

Corollary

If an assembly \(a\) is complete, then there exists a permutation \(\pi\) of \([\![1, k]\!]\) such that:

\[m_{\pi(1)} \le m_{\pi(2)} \le \dots \le m_{\pi(k-1)} \le m_{\pi(k)}\]

and:

\[\begin{split}\begin{array}{lll} up(m_{\pi(1)}) &=& down(v) \\ down(m_{\pi(k)}) &=& up(v) \end{array}\end{split}\]

Property: Uniqueness of the cohesive ends

If an assembly

\[m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A \subset (\Sigma^\star \cup \Sigma^{(c)})\]

is unambiguous and complete, then \(\forall i \in [\![1, k]\!]\),

\[\begin{split}\left\{ \begin{array}{llll} up(m_i) &\ne& down(m_i)& \\ up(m_i) &\ne& up(m_j), & j \in [\![1, k]\!] \backslash \{i\} \\ down(m_i) &\ne& down(m_j), & j \in [\![1, k]\!] \backslash \{i\} \\ \end{array} \right .\end{split}\]

Demonstration

Let there be an unambiguous complete assembly

\[a:\quad m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A\]

  • \(up(m_i) \ne down(m_i)\)

    Let’s suppose that \(\exists i \in [\![1, k]\!]\) such that

    \[up(m_i) = down(m_i)\]

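    then \(\langle \{m_1, \dots, m_k\} \backslash \{m_i\}, \le \rangle\) is also an insert. Since \(a\) is complete, \(\langle \{m_1, \dots, m_k\}, \le \rangle\) is an insert as well, so \(a\) has two distinct inserts, which cannot be since \(a\) is unambiguous.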

  • \(up(m_i) \ne up(m_j)\)

    Let’s suppose that \(\exists (i, j) \in [\![1, k]\!]^2\), with \(i \ne j\), such that

    \[up(m_i) = up(m_j)\]

    Since \(a\) is complete, there exists a permutation \(\pi\) such that

    \[m_{\pi(1)} \le m_{\pi(2)} \le \dots \le m_{\pi(k-1)} \le m_{\pi(k)}\]

    and since \(a\) is unambiguous, \(\langle \{m_1, \dots, m_k\}, \le \rangle\) is the only insert.

  • \(down(m_i) \ne down(m_j)\)

    TODO

Property: Uniqueness of the assembled plasmid

If an assembly

\[m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A \subset (\Sigma^\star \cup \Sigma^{(c)})\]

is unambiguous, then

\[A \cap \Sigma^{(c)} = \{p\}\]

with

\[p = \left( up(v) \cdot b \cdot up(m_{\pi(1)}) \cdot t_{\pi(1)} \cdot \, \dots \, \cdot up(m_{\pi(n)}) \cdot t_{\pi(n)} \right) ^{(c)}\]

(\(n \le k\), \(n = k\) if \(a\) is complete).
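The formula for \(p\) can be sketched in Python as a plain concatenation, once the modules of the insert are listed in the order given by \(\pi\). The function names and the `(up, target, down)` tuple layout are assumptions made for illustration. The result is one linear representative of the circular word, so two results should be compared up to rotation.

```python
def assemble(vector_up, vector_down, backbone, ordered_modules):
    """Concatenate up(v), b, then up(m) and t for each module of the insert.

    ordered_modules: list of (up, target, down) tuples, already sorted
    along the insert. Raises ValueError if the overhangs do not chain.
    """
    parts = [vector_up, backbone]
    expected = vector_down  # overhang the next module must start with
    for up_overhang, target, down_overhang in ordered_modules:
        if up_overhang != expected:
            raise ValueError("overhangs do not chain")
        parts.append(up_overhang + target)
        expected = down_overhang
    if expected != vector_up:
        raise ValueError("last module does not close onto the vector")
    return "".join(parts)


def same_circular(a, b):
    """True if b is a rotation of a, i.e. both spell the same circular word."""
    return len(a) == len(b) and a in b + b


p = assemble(
    "CGCT", "AATG", "TTTT",
    [("AATG", "CCC", "GCTT"), ("GCTT", "GGG", "CGCT")],
)
print(p)
```

Each overhang appears exactly once in the product, as in the formula: the downstream overhang of one module is the upstream overhang of the next, and the last downstream overhang is \(up(v)\), which opens the word.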

Demonstration

TODO