Standard Modular Cloning System

System Definition

Definition

Given a genetic alphabet \(\langle \Sigma, \sim \rangle\), a Modular Cloning System \(S\) is defined as a mathematical sequence

\[(M_l,\ V_l,\ e_l)_ {\ l\ \ge -1}\]

where:

\(M_l \subseteq \Sigma^\star \cup \Sigma^{(c)}\) is the set of modules of level \(l\)
\(V_l \subseteq \Sigma^{(c)}\) is the set of vectors of level \(l\)
\(e_l \subseteq E\) is the finite, non-empty set of asymmetric, Type IIS restriction enzymes of level \(l\)

Definition: \(k\)-cyclicity

A Modular Cloning System \((M_l, V_l, e_l)_ {l \ge -1}\) is said to be \(k\)-cyclic after a level \(\lambda\) if:

\[\begin{split}\begin{array}{ll} \exists k \in \mathbb{N}^\star, & \\ \forall l \ge \lambda, & \\ & \begin{cases} M_{l+k} \subseteq M_l \\ V_{l+k} \subseteq V_l \\ e_{l+k} \subseteq e_l \end{cases} \end{array}\end{split}\]
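The inclusions above can only be verified on a finite window of levels in practice. The following is a minimal sketch under that assumption: `is_k_cyclic`, `toy_level` and the `horizon` parameter are hypothetical names introduced for illustration, and the toy system simply alternates two enzyme names with empty module and vector sets.

```python
def is_k_cyclic(level, k, lam, horizon):
    """Check the k-cyclicity inclusions for every level l in [lam, horizon].

    `level` is a function mapping a level number l >= -1 to a tuple
    (M_l, V_l, e_l) of frozensets. Only a finite window is inspected,
    so this is a necessary check, not a proof.
    """
    for l in range(lam, horizon + 1):
        M, V, e = level(l)
        M_k, V_k, e_k = level(l + k)
        # M_{l+k} <= M_l, V_{l+k} <= V_l, e_{l+k} <= e_l
        if not (M_k <= M and V_k <= V and e_k <= e):
            return False
    return True


def toy_level(l):
    """Hypothetical system: enzyme sets alternate with the parity of l."""
    enzymes = frozenset({"BsaI"}) if l % 2 else frozenset({"BpiI"})
    return frozenset(), frozenset(), enzymes


print(is_k_cyclic(toy_level, k=2, lam=-1, horizon=10))
print(is_k_cyclic(toy_level, k=1, lam=-1, horizon=10))
```

With this toy system the check succeeds for \(k = 2\) and fails for \(k = 1\), since consecutive levels use different enzymes.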

Definition: \(\lambda\)-limit

A Modular Cloning System \((M_l, V_l, e_l)_ {l \ge -1}\) is said to be \(\lambda\)-limited if:

\[\forall l \ge \lambda, M_l = \emptyset, V_l = \emptyset, e_l = \emptyset\]

Modules

Definition

For a given level \(l\), \(M_l\) is defined as the set of modules \(m \in \Sigma^\star \cup \Sigma^{(c)}\) for which:

\[\begin{split}\begin{array}{l} \exists ! (S, n, k) \in e_l, \\ \exists ! (S^\prime, n^\prime, k^\prime) \in e_l, \\ \exists ! (s, s^\prime) \in S \times S^\prime, \\ \exists ! (x, y, o_5, o_3) \in (\Sigma^\star)^4, \\ \\ \quad \exists ! t \in \Sigma^\star, \left\{ \begin{array}{lll} \exists ! b \in \Sigma^\star,\ & m = (s \cdot x \cdot o_5 \cdot t \cdot o_3 \cdot y \cdot \widetilde{s^\prime} \cdot b)^{(c)}, & \text{ if } m \in \Sigma^{(c)}\\ \exists ! (u, v) \in (\Sigma^\star)^2, & m = u \cdot s \cdot x \cdot o_5 \cdot t \cdot o_3 \cdot y \cdot \widetilde{s^\prime} \cdot v, & \text{ if } m \not \in \Sigma^{(c)} \end{array} \right. \end{array}\end{split}\]

with:

\(|x| = n\)
\(|y| = n^\prime\)
\(|o_5| = abs(k)\)
\(|o_3| = abs(k^\prime)\)

Note

This decomposition is called the canonic module decomposition, where:

\(t\) is the target sequence of the module \(m\)
\(b\) is the backbone of the module \(m\) (if \(m\) is circular)
\(u\) and \(v\) are called the prefix and suffix of the module \(m\) (if \(m\) is not circular)
\(o_5\) and \(o_3\) are the upstream and downstream overhangs respectively.
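As an illustration, the canonic decomposition of a linear module can be sketched in Python for the simplest case: a single enzyme \((S, n, k)\) with one recognition site, used on both sides, over the DNA alphabet. The helper names (`revcomp`, `decompose_linear`) and the example sequence are assumptions made for this sketch, not part of the formal definition.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA word (the tilde operator)."""
    return seq.translate(COMPLEMENT)[::-1]


def decompose_linear(m, site, n, k):
    """Return (o5, t, o3) for a linear module, or None if m is not a module.

    Assumes one enzyme with a single recognition site `site`, cutting
    n bases downstream and leaving an overhang of |k| bases.
    """
    # Zero-width lookahead so that overlapping occurrences are all found.
    fwd = [x.start() for x in re.finditer(f"(?={re.escape(site)})", m)]
    rev = [x.start() for x in re.finditer(f"(?={re.escape(revcomp(site))})", m)]
    if len(fwd) != 1 or len(rev) != 1:
        return None  # the decomposition must be unique
    start = fwd[0] + len(site) + n   # first base of o5
    end = rev[0] - n                 # one past the last base of o3
    if end - start < 2 * abs(k):
        return None  # no room for two overhangs
    o5 = m[start:start + abs(k)]
    o3 = m[end - abs(k):end]
    t = m[start + abs(k):end - abs(k)]
    return o5, t, o3


# u = "AA", s = "GGTCTC", x = "A", o5 = "AATG", t = "CCCCCC",
# o3 = "GCTT", y = "T", reverse-complemented site = "GAGACC", v = "TT"
example = "AA" + "GGTCTC" + "A" + "AATG" + "CCCCCC" + "GCTT" + "T" + "GAGACC" + "TT"
print(decompose_linear(example, "GGTCTC", n=1, k=4))
```

The sketch rejects any sequence with zero or several occurrences of either site, which mirrors the uniqueness quantifiers of the definition in this restricted setting.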

Property

\(\forall \langle \Sigma, \sim \rangle\), \(\forall l \ge -1\), \(\forall e_l \subseteq E\):

\[M_l \text{ is a rational language }\]

Demonstration

Let there be a genetic alphabet \(\langle \Sigma, \sim \rangle\) and a Modular Cloning System \((M_l, V_l, e_l)_ {l \ge -1}\) over it.

\(\forall l \ge -1\), the regular expression:

\[\begin{split}\begin{array}{l} \bigcup_{\begin{array}{l}(S, n, k) \in e_l \\ (S^\prime, n^\prime, k^\prime) \in e_l\end{array}} \Sigma^\star \cdot S \cdot \Sigma^n \cdot \Sigma^{abs(k)} \cdot \Sigma^\star \cdot \overline{(S | S^\prime)} \cdot \Sigma^\star \cdot \Sigma^{abs(k^\prime)} \cdot \Sigma^{n^\prime} \cdot \widetilde{\,S^\prime\,} \cdot \Sigma^\star \\ \end{array}\end{split}\]

where:

\(\star\) is the Kleene star.
\(\widetilde{S} = \{\widetilde{s}, s \in S\}\) (reverse complementation operator).
\(\overline{S} = \{w \in \Sigma^\star, w \not \in S\}\) (complement operator).
\(S | S^\prime = S \cup S^\prime\) (alternation operator).

matches a sequence \(m \in \Sigma^\star \cup \Sigma^{(c)}\) if and only if \(m \in M_l\).

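\(M_l\) is therefore denoted by a regular expression (regular languages being closed under union, concatenation and complement), so by Kleene’s Theorem, \(M_l\) is a rational language.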
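A rough Python analogue of this expression, for one enzyme with a single recognition site over the DNA alphabet, is sketched below. It is an approximation under stated assumptions: the complement term is replaced by a negative lookahead that forbids either site from starting inside the target, and `module_pattern` is a hypothetical helper name.

```python
import re


def module_pattern(site, rc_site, n, k):
    """Compile a pattern for linear modules of a single enzyme (S, n, k).

    `rc_site` is the reverse complement of `site`. The three capture
    groups are o5, the target t, and o3.
    """
    sigma = "[ACGT]"
    # Stand-in for the complement term: no site may start inside the target.
    inner = f"(?:(?!{site}|{rc_site}){sigma})*"
    return re.compile(
        f"{sigma}*{site}{sigma}{{{n}}}"
        f"({sigma}{{{abs(k)}}})({inner})({sigma}{{{abs(k)}}})"
        f"{sigma}{{{n}}}{rc_site}{sigma}*"
    )


pattern = module_pattern("GGTCTC", "GAGACC", n=1, k=4)
sequence = "AA" + "GGTCTC" + "A" + "AATG" + "CCCCCC" + "GCTT" + "T" + "GAGACC" + "TT"
match = pattern.fullmatch(sequence)
print(match.groups() if match else None)
```

Python's `re` engine supports lookaheads, which go beyond classical regular expressions; here the lookahead only encodes a complement, so the recognised language stays regular.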

Vectors

Definition

For a given level \(l\), \(V_l\) is defined as the set of vectors \(v \in \Sigma^{(c)}\) for which:

with:

\(|x| = n\)
\(|y| = n^\prime\)
\(|o_5| = abs(k)\)
\(|o_3| = abs(k^\prime)\)
\(o_3 \ne o_5\)

Note

This decomposition is called the canonic vector decomposition, where:

\(p\) is the placeholder sequence of the vector \(v\)
\(b\) is the backbone of the vector \(v\)
\(o_3\) and \(o_5\) are the upstream and downstream overhangs respectively.

Overhangs

By definition, every valid level \(l\) module or vector has a single canonic decomposition, and therefore unique \(o_5\) and \(o_3\) overhangs. As such, let the function \(up\) (resp. \(down\)) be defined as the function which:

to a module \(m\) associates the word \(o_5\) (resp. \(o_3\)) from its canonic module decomposition
to a vector \(v\) associates the word \(o_3\) (resp. \(o_5\)) from its canonic vector decomposition.

Standard Assembly

Definition: Standard MoClo Assembly

Given an assembly of level \(l\), where \((m_1, \dots, m_k) \in M_l^k\) and \(v \in V_l\):

\[a:\quad m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A \subset (\Sigma^\star \cup \Sigma^{(c)})\]

and the partial order \(\le\) over \(S = \{m_1, \dots, m_k\}\) defined as:

\[\begin{split}\begin{array}{l} \forall x, y \in S^2, \\ \quad x \le y \iff \begin{cases} x = y & \\ down(x) = up(y) & \text{ if } x \ne y\\ \exists z \in S \backslash \{x, y\}, down(x) = up(z), \ z \le y & \text{ if } x \ne y \text{ and } down(x) \ne up(y) \end{cases} \end{array}\end{split}\]

then a chain \(\langle S^\prime, \le \rangle \subset \langle S, \le \rangle\) is an insert if:

\[\begin{split}\begin{cases} v \le min(S^\prime) \\ max(S^\prime) \le v \end{cases} \iff \begin{cases} down(v) = up(min(S^\prime)) \\ up(v) = down(max(S^\prime)) \end{cases}\end{split}\]

\(a\) is:

invalid if \(\langle S, \le \rangle\) is an antichain or \(\langle S, \le \rangle\) has no insert.
valid if \(\langle S, \le \rangle\) has at least one insert.
ambiguous if \(\langle S, \le \rangle\) has more than one insert.
unambiguous if \(\langle S, \le \rangle\) has exactly one insert.
complete if \(\langle S, \le \rangle\) is an insert.
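The classification above can be sketched in Python by modelling each module as a pair of overhangs \((up, down)\). This is an illustration under an explicit assumption: an insert is interpreted here as a run of distinct modules linked consecutively by matching overhangs, starting at \(down(v)\) and ending at \(up(v)\). The names `find_inserts` and `classify`, as well as the overhang values, are hypothetical.

```python
def find_inserts(modules, vector):
    """Enumerate inserts as tuples of module names.

    modules: dict mapping a name to its (up, down) overhangs.
    vector:  (up, down) overhangs of the vector.
    """
    v_up, v_down = vector
    inserts = []

    def extend(path):
        last_down = modules[path[-1]][1]
        if last_down == v_up:
            # down(max) = up(v): the chain closes on the vector
            inserts.append(tuple(path))
        for name, (up, _) in modules.items():
            if name not in path and up == last_down:
                extend(path + [name])

    for name, (up, _) in modules.items():
        if up == v_down:
            # down(v) = up(min): the chain opens on the vector
            extend([name])
    return inserts


def classify(modules, vector):
    """Report the validity, ambiguity and completeness of an assembly."""
    inserts = find_inserts(modules, vector)
    return {
        "valid": len(inserts) >= 1,
        "unambiguous": len(inserts) == 1,
        "complete": any(len(p) == len(modules) for p in inserts),
    }


parts = {
    "promoter": ("GGAG", "AATG"),
    "cds": ("AATG", "GCTT"),
    "terminator": ("GCTT", "CGCT"),
}
print(classify(parts, vector=("CGCT", "GGAG")))
```

Enumerating simple paths is exponential in the worst case, which is acceptable for the handful of modules used in a typical assembly.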

Corollary

If an assembly \(a\) is complete, then there exists a permutation \(\pi\) of \([\![1, k]\!]\) such that:

\[m_{\pi(1)} \le m_{\pi(2)} \le \dots \le m_{\pi(k-1)} \le m_{\pi(k)}\]

and:

\[\begin{split}\begin{array}{lll} up(m_{\pi(1)}) &=& down(v) \\ down(m_{\pi(k)}) &=& up(v) \end{array}\end{split}\]

Property: Uniqueness of the cohesive ends

If an assembly

\[m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A \subset (\Sigma^\star \cup \Sigma^{(c)})\]

is unambiguous and complete, then \(\forall i \in [\![1, k]\!]\),

\[\begin{split}\left\{ \begin{array}{llll} up(m_i) &\ne& down(m_i)& \\ up(m_i) &\ne& up(m_j), & j \in [\![1, k]\!] \backslash \{i\} \\ down(m_i) &\ne& down(m_j), & j \in [\![1, k]\!] \backslash \{i\} \\ \end{array} \right .\end{split}\]
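The three conditions of this property translate directly into a check over the overhang pairs of the modules. The sketch below assumes each module is given as an \((up, down)\) pair; `cohesive_ends_are_unique` and the sample overhangs are hypothetical.

```python
from collections import Counter


def cohesive_ends_are_unique(modules):
    """Check the three inequalities for every module.

    modules: dict mapping a name to its (up, down) overhangs.
    """
    ups = Counter(up for up, _ in modules.values())
    downs = Counter(down for _, down in modules.values())
    return all(
        up != down            # up(m_i) != down(m_i)
        and ups[up] == 1      # no other module shares this upstream overhang
        and downs[down] == 1  # no other module shares this downstream overhang
        for up, down in modules.values()
    )


good = {"promoter": ("GGAG", "AATG"), "cds": ("AATG", "GCTT")}
clash = {"cds_a": ("AATG", "GCTT"), "cds_b": ("AATG", "CGCT")}
print(cohesive_ends_are_unique(good), cohesive_ends_are_unique(clash))
```

Such a check can serve as a cheap necessary condition before searching for inserts: if it fails, the assembly cannot be both unambiguous and complete.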

Demonstration

Let there be an unambiguous complete assembly

\[a:\quad m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A\]

\(up(m_i) \ne down(m_i)\)

Let’s suppose that \(\exists i \in [\![1, k]\!]\) such that

\[up(m_i) = down(m_i)\]

then \(\langle \{m_1, \dots, m_k\} \backslash \{m_i\}, \le \rangle\) is also an insert. Since \(a\) is complete, \(\langle \{m_1, \dots, m_k\}, \le \rangle\) is an insert as well, which contradicts the unambiguity of \(a\).
\(up(m_i) \ne up(m_j)\)

Let’s suppose that \(\exists (i, j) \in [\![1, k]\!]^2\), \(i \ne j\), such that

\[up(m_i) = up(m_j)\]

Since \(a\) is complete, there exists a permutation \(\pi\) of \([\![1, k]\!]\) such that

\[m_{\pi(1)} \le m_{\pi(2)} \le \dots \le m_{\pi(k-1)} \le m_{\pi(k)}\]

and since \(a\) is unambiguous, \(\langle \{m_1, \dots, m_k\}, \le \rangle\) is the only insert.
\(down(m_i) \ne down(m_j)\)

TODO

Property: Uniqueness of the assembled plasmid

If an assembly

\[m_1 + \dots + m_k \xrightarrow{\quad e_l \quad} A \subset (\Sigma^\star \cup \Sigma^{(c)})\]

is unambiguous, then

\[A \cap \Sigma^{(c)} = \{p\}\]

with

\[p = \left( up(v) \cdot b \cdot up(m_{\pi(1)}) \cdot t_{\pi(1)} \cdot \, \dots \, \cdot up(m_{\pi(n)}) \cdot t_{\pi(n)} \right) ^{(c)}\]

(\(n \le k\), \(n = k\) if \(a\) is complete).
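The expression of \(p\) can be followed term by term in a short sketch. It assumes the modules are already ordered along the insert and given as \((up, t)\) pairs, and it represents the circular word by one of its linear rotations; `assemble` and the sample words are hypothetical.

```python
def assemble(vector_up, backbone, ordered_modules):
    """Concatenate up(v), b, then up(m) and t for each module in order.

    ordered_modules: list of (up, target) pairs, in insert order.
    The returned string is one linear rotation of the circular plasmid.
    """
    parts = [vector_up, backbone]
    for up, target in ordered_modules:
        parts += [up, target]
    return "".join(parts)


plasmid = assemble(
    vector_up="CGCT",
    backbone="TTTT",
    ordered_modules=[("GGAG", "AAA"), ("AATG", "CCC")],
)
print(plasmid)
```

Because \(p\) is circular, two results should be compared up to rotation rather than as plain strings.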

Demonstration

TODO