(coasim SNP genotypes)
This module contains functions for manipulating SNP (unphased) genotypes.
(coasim SNP haplotypes)
This module contains functions for manipulating SNP haplotypes.
(coasim batch)
This module contains functions for running batchs of simulations. Although most of the functionallity in this module can be easily implemented using Scheme's control-flow structures, the functions here make batch running just a tad simpler.
(coasim disease-modelling)
This module contains functions for disease models, specifically for splitting simulated sequences in diseased and healthy individuals based on their haplotypes or genotypes.
(coasim io)
This module contains functions for reading and writing coasim data.
(coasim markers)
This module contains functions manipulating sorted lists of markers.
(coasim rand)
This module contains functions making random marker configurations.
(coasim statistics)
This module contains functions for calculating various statistics on the ARG and on lists of haplotypes.
(coasim stepwise)
This module contains a step-wise mutation model for micro-satellite markers.
Checks if a node contains ancestral material in a given point.
(ancestral? node point)
(define coa-times '()) ; times for coalescent events at point 0.5 (define (coa-cb n k) (if (ancestral? n 0.5) (set! coa-times (cons (event-time n) coa-times)))) (simulate markers no-leaves :rho 400 :coalescence-callback coa-cb)
Checks if a node contains ancestral material in a given point.
Makes a bottleneck epoch.
(bottleneck population scale-fraction begin-time end-time)
(bottleneck 0 0.1 0.9 1.22)
Specify a bottleneck the simulation process will pass through.
The end-time parameter is optional, if left out the bottleneck continues to infinity (or as long as the population exists).
Returns the children of a node.
(children node)
(define (collect-event-times i)
(letrec ((collect-et-r
(lambda (root point)
(if (not (ancestral? root point))
'()
(cond ((leaf-node? root)
(list (event-time root)))
((coalescent-node? root)
(let* ((cs (children root))
(left (collect-et-r (car cs) point))
(right (collect-et-r (cadr cs) point)))
(cons (event-time root) (append left right))))
((or (recombination-node? root)
(gene-conversion-node? root))
(cons (event-time root)
(collect-et-r (car (children root))
point))))))))
(collect-et-r (root i) (interval-start i))))Returns the children of a node.
A predicate recognising coalescent nodes.
(coalescent-node? node)
(if (coalescent-node? n) (event-time n))
Returns #t if `node` is a coalescent node, #f otherwise.
Creates a custom marker type
(custom-marker position ancestral-value mutation-function)
(define (step-ms-marker pos theta)
(let* ((msec (cdr (gettimeofday)))
(random-state (seed->random-state msec))
(waiting-time
(lambda ()
(let ((mean (/ 2 theta)));mean is 1/i where the intensity i is theta/2
(* mean (random:exp random-state)))))
(mutate-to
(lambda (parent-allele)
(if (< (random 1.0 random-state) 0.5)
(- parent-allele 1)
(+ parent-allele 1))))
(mutate
(lambda (parent child parent-allele)
(let loop ((allele parent-allele)
(time-left
(- (- (event-time parent) (event-time child))
(waiting-time))))
(if (< time-left 0)
allele
(let ((new-allele (mutate-to allele))
(next-time (waiting-time)))
(loop new-allele (- time-left next-time))))))))
(custom-marker pos 0 mutate))) Creates a custom marker at position `position', with an ancestral value given by `ancestral-value' (either an integer or a thunk returning an integer), and a mutation function `mutate', that takes arguments `parent-node' and `child-node' of an edge to mutate, and the allele at the parent of the edge.
Returns the destination population of a migration.
(destination-population migration-node)
(define mig-destinations
(fold-nodes ARG
(lambda (node lst)
(if (migration-node? node)
(cons (destination-population node) lst)
lst))
'())) Returns the destination population of a migration event (back in time, that is, the population the lineage belongs to above the migration event).
Returns the time of the event represented by a node.
(event-time node)
(define coa-times '()) (define (coa-cb n k) (set! coa-times (cons (event-time n) coa-times))) (define rc-times '()) (define (rc-cb n1 n2) (set! rc-times (cons (event-time n1) rc-times))) (simulate markers no-leaves :rho 400 :coalescence-callback coa-cb :recombination-callback rc-cb)
Returns the time of the event represented by a node.
Calculate a value from all ARG nodes.
(fold-nodes arg f init)
(define no-nodes (fold-nodes arg (lambda (n count) (+ count 1)) 0)) (define no-leaves (fold-nodes arg (lambda (n count) (if (leaf-node? n) (+ count 1) count)) 0))
Calculates a value from all ARG nodes. Call function `f' on each node in the ARG, accumulating a value that is the result of the fold. For each node, `f' is called with the node as its first argument and the value calculated so far as its second argument; the second argument is initially `init'.
Returns the end point of a gene conversion.
(gene-conversion-end gene-conversion-node)
(define gc-end '()) (define (gc-cb n1 n2 k) (set! gc-end (cons (gene-conversion-end n1) gc-end))) (simulate markers no-leaves :gamma 10 :Q 0.2 :geneconversion-callback gc-cb)
Returns the end point of a gene conversion.
A predicate recognising gene conversion nodes.
(gene-conversion-node? node)
(if (gene-conversion-node? n) (event-time n))
Returns #t if `node` is a gene conversion node, #f otherwise.
Returns the start point of a gene conversion.
(gene-conversion-start gene-conversion-node)
(define gc-start '()) (define (gc-cb n1 n2 k) (set! gc-start (cons (gene-conversion-start n1) gc-start))) (simulate markers no-leaves :gamma 10 :Q 0.2 :geneconversion-callback gc-cb)
Returns the start point of a gene conversion.
Makes a growth epoch.
(growth population beta begin-time end-time)
(growth 0 10 0.9 1.22)
Specify a period of exponetial growth.
The end-time parameter is optional, if left out the bottleneck continues to infinity (or as long as the population exists).
Returns tree local to an interval.
(interval->tree interval)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (intervals (simulate markers 100 :rho 400))) (define trees (map interval->tree intervals))
Returns the tree local to an interval.
Returns the end position of an interval.
(interval-end interval)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (let ((arg (simulate markers 100 :rho 400))) (intervals arg))) (map interval-end intervals)
Returns the end position of an interval.
Returns the start position of an interval.
(interval-start interval)
(define p (arg-parameters rho Q G beta)) (define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (let ((arg (simulate p markers 100))) (intervals arg))) (map interval-start intervals)
Returns the start position of an interval.
Returns the intervals sharing genealogy in the ARG as a list.
(intervals arg)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (let ((arg (simulate markers 100 :rho 400))) (intervals arg)))
Returns the intervals sharing genealogy in the ARG as a list. Only intervals containing markers are returned.
A predicate recognising leaf nodes.
(leaf-node? node)
(if (leaf-node? n) 0.0 (event-time n))
Returns #t if `node` is a leaf node, #f otherwise.
Returns the local trees, i.e. the trees for intervals sharing genealogy in the ARG as a list.
(local-trees arg)
(define p (arg-parameters rho Q G beta)) (define markers (make-random-snp-markers 10 0.1 0.9)) (define trees (let ((arg (simulate markers 100 :rho 400))) (local-trees arg)))
Returns the local trees, i.e. the trees for intervals sharing genealogy in the ARG as a list. Only trees containing markers are returned.
Makes a migration epoch.
(migration source-population destination-population rate begin-time end-time)
(migration 0 1 0.1 0.9 1.22)
Specify a period of migration from population `source-population' to `destination-population' with scaled migration rate `rate'.
A predicate recognising migration nodes.
(migration-node? node)
(if (migration-node? n) (event-time n))
Returns #t if `node` is a migration node, #f otherwise.
Remember: the ARG will only contain migration nodes if it was simulated with option :keep-migration-events set to #t.
Creates a micro-satellite marker on the simulated region.
(ms-marker position theta K)
(ms-marker 0.5 0.2 10)
Creates a micro-satellite marker on the simulated region. The first parameter determines the position (between 0 and 1) along the region, the second the mutation rate of the marker, and the third the size of the alphabet for the marker (in the K allele model).
A predicate that evalutes to true for micro-satellite markers only.
(ms-marker? marker)
(ms-marker? marker)
A predicate that evalutes to true for micro-satellite markers and false for all other types.
Merges two sub-populations.
(population-merge merge-time pop1 pop2 ...)
(population-merge 0.9 0 1)
Sets up the merge time for two or more sub-populations. The effect of the merge is that all individuals in the populations are moved to the first population.
Returns the position of a marker.
(position marker)
(position marker)
Returns the position of a marker.
A predicate recognising recombination nodes.
(recombination-node? node)
(if (recombination-node? n) (event-time n))
Returns #t if `node` is a recombination node, #f otherwise.
Returns the recombination point of a recombination node.
(recombination-point recombination-node)
(define rc-points '()) (define (rc-cb n1 n2 k) (set! rc-points (cons (recombination-point n1) rc-points))) (simulate markers no-leaves :rho 400 :recombination-callback rc-cb)
Returns the recombination point of a recombination node.
Returns the root of the local tree of an interval.
(root interval-or-tree)
(use-modules (coasim rand)) (define markers (make-random-snp-markers 10 0.1 0.9)) (define trees (local-trees (simulate markers 100 :rho 400))) (map root trees)
Returns the root node of the local tree of an interval. The argument to the function can be either a local interval, as returned by the intervals function, or a local tree, as returned by the local-trees function.
Returns the simulated sequences of an ARG as a list of lists.
(sequences arg)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define haplotypes (let ((arg (simulate markers 100 :rho 400))) (sequences arg)))
Returns the simulated sequences of an ARG as a list of lists.
Changes the position of a marker.
(set-position! marker new-position)
(set-position! marker 0.5)
Changes the position of a marker. In general, you will not need this function, but it is useful when rescaling the interval to implement, e.g. variable recombination rate.
Simulate an ARG and corresponding sequences.
(simulate marker-list sim-spec . additional-keyword-parameters)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define arg (simulate markers 100 :rho 400 :beta 10)) (define coa-times '()) (define (coa-cb n k) (set! coa-times (cons (event-time n) coa-times))) (simulate markers 10 :coalescence-callback coa-cb :rho 400 :beta 10) (display "coalescence times:\n") (map (lambda (t) (display t)(newline)) coa-times) (newline)
Simulate an ARG and corresponding sequences, based on ARG parameters, a list of markers, and a simulation specification. The specification is either a number--the number of samples to simulate -- or a population structure specification -- see the CoaSim Guile Manual for details.
The building of the ARG is affected by the following paramters, that can be set using keyword arguments:
For specifying migration rates between subpopulations (see the CoaSim Guile Manual for details), the keyword parameter :migrations is used. This parameter takes a list of migration specifications.
For fine-monitoring of the simulation, callback functions can be given as key-word arguments. The supported callbacks are:
The following additional keyword arguments are available:
Simulate a list of sequences.
(simulate-sequences markers no-leaves . additional-simulation-parameters)
(define markers (make-random-snp-markers 10 0 1)) (define seqs (simulate-sequences markers 10 :rho 400))
Simulate a list of sequences. This function is just a short-cut for `simulate' followed by `sequences':
(define (simulate-sequences . args)
(sequences (apply simulate args)))
Creates a SNP marker on the simulated region.
(snp-marker position low-freq high-freq)
(snp-marker 0.5 0.1 0.9)
Creates a SNP marker on the simulated region. The first parameter determines the position (between 0 and 1) along the region, the second the lowest frequency accepted for the mutation type, and the third the highest frequency accepted for the mutation type.
A predicate that evalutes to true for SNP markers only.
(snp-marker? marker)
(snp-marker? marker)
A predicate that evalutes to true for SNP markers and false for all other types.
Returns the source population of a migration.
(source-population migration-node)
(define mig-sources
(fold-nodes ARG
(lambda (node lst)
(if (migration-node? node)
(cons (source-population node) lst)
lst))
'())) Returns the source population of a migration event (back in time, that is, the population the lineage belongs to below the migration event).
Returns the total tree branch length of the local tree of an interval.
(total-branch-length interval-or-tree)
(use-modules (coasim rand)) (define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (intervals (simulate markers 100 :rho 400))) (map total-branch-length intervals)
Returns the total tree branch length of the local tree of an interval. The argument to the function can be either a local interval, as returned by the intervals function, or a local tree, as returned by the local-trees function.
Creates a trait marker on the simulated region.
(trait-marker position low-freq high-freq)
(trait-marker 0.5 0.18 0.22)
Creates a trait marker on the simulated region. The first parameter determines the position (between 0 and 1) along the region, the second the lowest frequency accepted for the mutation type, and the third the highest frequency accepted for the mutation type.
A predicate that evalutes to true for trait markers only.
(trait-marker? marker)
(trait-marker? marker)
A predicate that evalutes to true for trait markers and false for all other types.
Checks if a node contains trapped material in a given point.
(trapped? node point)
(define coa-times '()) ; times for coalescent events at point 0.5 (define (coa-cb n k) (if (or (ancestral? n 0.5) (trapped? n 0.5))(set! coa-times (cons (event-time n) coa-times)))) (simulate markers no-leaves :rho 400 :coalescence-callback coa-cb)
Checks if a node contains trapped material in a given point.
Returns interval a local tree covers.
(tree->interval tree)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (let ((arg (simulate markers 100 :rho 400))) (intervals arg))) (define trees (map interval->tree intervals)) (define intervals2 (map tree->interval trees))
Returns interval a local tree covers. The function is the reverse of interval->tree.
Returns the height of the local tree of an interval.
(tree-height interval-or-tree)
(use-modules (coasim rand)) (define markers (make-random-snp-markers 10 0.1 0.9)) (define intervals (local-trees (simulate markers 100 :rho 400))) (map tree-height intervals)
Returns the height of the local tree of an interval. The argument to the function can be either a local interval, as returned by the intervals function, or a local tree, as returned by the local-trees function.
This module contains functions for manipulating SNP (unphased) genotypes.
Translate a list of SNP haplotypes into a list of SNP genotypes.
(haplotypes->genotypes haplotype-list)
(use-modules (coasim rand) (coasim SNP genotypes)) (define haplotypes (simulate-sequences (make-random-snp-markers 4 0 1) 8)) (display haplotypes)(newline) (define genotypes (haplotypes->genotypes haplotypes)) (display genotypes)(newline)
Translate a list containing an even number of SNP haplotypes into a list of SNP genotypes. The haplotypes are paired and the alleles are translated such that homozygote 00 becomes 0, homozygote 11 becomes 1, and heterozygotes become 2.
Split a list of genotypes into cases and controls.
(split-in-cases-controls-on-marker genotypes trait-idx)
(define haplotypes (simulate-sequences markers 100))
(define genotypes (haplotypes->genotypes haplotypes))
(define cases-and-controls (split-in-cases-controls-on-marker genotypes trait-idx
:disease-model 'recessive))
(define cases (car cases-and-controls))
(define controls (cadr cases-and-controls))Split a dataset into cases and controls, based on the value at trait-idx.
By default the alleles at the trait marker is removed from the resulting lists; an optional parameter, :remove-trait, if set to #f, will prevent removal of the trait marker.
The following keyword arguments can be used to control the split:
Split a list of genotypes into cases and controls.
(split-in-cases-controls-on-marker/phased haplotypes trait-idx)
(define haplotypes (simulate-sequences markers 100))
(define cases-and-controls (split-in-cases-controls-on-marker haplotypes trait-idx
:disease-model 'recessive))
(define cases (car cases-and-controls))
(define controls (cadr cases-and-controls))Split a dataset into cases and controls, based on the value at trait-idx. The dataset consist of a list of haplotypes, with an even number of haplotypes; the haplotypes will be considered pairwise and split into cases/controls based on their genotype at the trait marker, rather than the haplotype. The resulting cases will be the pairs of haplotypes considered cases and the controls will be the remaining pairs.
By default the alleles at the trait marker is removed from the resulting lists; an optional parameter, :remove-trait, if set to #f, will prevent removal of the trait marker.
The following keyword arguments can be used to control the split:
This module contains functions for manipulating SNP haplotypes.
Split a list of haplotypes into cases and controls.
(split-in-cases-controls-on-marker haplotypes trait-idx)
(define haplotypes (simulate-sequences markers 100)) (define cases-and-controls (split-in-cases-controls-on-marker haplotypes trait-idx)) (define cases (car cases-and-controls)) (define controls (cadr cases-and-controls))
Split a dataset into cases and controls, based on the value at trait-idx.
By default the alleles at the trait marker is removed from the resulting lists; an optional parameter, :remove-trait, if set to #f, will prevent removal of the trait marker.
The following keyword arguments can be used to control the split:
This module contains functions for running batchs of simulations. Although most of the functionallity in this module can be easily implemented using Scheme's control-flow structures, the functions here make batch running just a tad simpler.
Call a block of code a certain number of times, collecting the result using a combination function.
(fold n comb init code)
(let* ((no-iterations 10000)
(branch-sum
(fold no-iterations (lambda (val sum) (+ val sum)) 0
(let* ((m (snp-marker 0.5 0 1))
(arg (simulate (list m) 10))
(tree (car (local-trees arg)))
(branch-length (total-branch-length tree)))
branch-length))))
(display (/ branch-sum no-iterations))(newline))Execute the code `code' `n' number of times, combining the result of `code' using the `comb' function. The `comb' is called with two arguments in each iteration, the first is the result of having run `code', the second is the result of the previous combination, initially the value `init' passed to `fold'.
Call a block of code a certain number of times.
(repeat n c)
(use-modules (coasim batch))
(repeat 20
(let* ((m (snp-marker 0.5 0 1))
(arg (simulate (list m) 10 :random-seed 10))
(tree (car (local-trees arg))))
(display tree)))
(newline) Execute the code `c' `n' number of times.
Call a block of code until a predicate evaluates to false.
(repeat-while p c)
(use-modules (ice-9 format) (coasim batch))
(repeat-while (lambda (branch-length) (< branch-length 5))
(let* ((m (snp-marker 0.5 0 1))
(arg (simulate (list m) 10))
(tree (car (local-trees arg)))
(branch-length (total-branch-length tree)))
(format #t "~4f " branch-length)
branch-length))
(newline)Execute the code `c' until the predicate `p', evaluated on the result of `c' evaluates to false.
Call a block of code a certain number of times.
(repeat-with-index (i n) c) (repeat-with-index (i start stop) c) (repeat-with-index (i start stop step) c)
(use-modules (coasim batch))
(repeat-with-index (i 10)
(let* ((markers (make-random-snp-markers 10 0 1))
(seqs (simulate-sequences markers 10))
(pos-file (string-append "positions." (number->string i) ".txt"))
(seq-file (string-append "sequences." (number->string i) ".txt")))
(call-with-output-file pos-file (marker-positions-printer markers))
(call-with-output-file seq-file (sequences-printer seqs))))
(repeat-with-index (i 2 12)
(let* ((markers (make-random-snp-markers 10 0 1))
(seqs (simulate-sequences markers 10))
(pos-file (string-append "positions." (number->string i) ".txt"))
(seq-file (string-append "sequences." (number->string i) ".txt")))
(call-with-output-file pos-file (marker-positions-printer markers))
(call-with-output-file seq-file (sequences-printer seqs))))
(repeat-with-index (i 2 20 2)
(let* ((markers (make-random-snp-markers 10 0 1))
(seqs (simulate-sequences markers 10))
(pos-file (string-append "positions." (number->string i) ".txt"))
(seq-file (string-append "sequences." (number->string i) ".txt")))
(call-with-output-file pos-file (marker-positions-printer markers))
(call-with-output-file seq-file (sequences-printer seqs)))) Execute the code `c' a number of times, making the index variable `i' visible to the code in `c'. Comes in three variants, a simple iteration from 1 to `n', an iteration from `start' to `stop', and an iteration from `start' to `stop' in jumps of `step'.
Call a block of code a certain number of times, collecting the results in a list.
(tabulate n c)
(use-modules (coasim batch))
(let* ((no-iterations 10000)
(branch-lengths
(tabulate no-iterations
(let* ((m (snp-marker 0.5 0 1))
(arg (simulate (list m) 10))
(tree (car (local-trees arg)))
(branch-length (total-branch-length tree)))
branch-length))))
(display (/ (apply + branch-lengths) no-iterations))(newline)) Execute the code `c' `n' number of times, collecting the results of evaluating `c' in a list that is returned as the result of the tabulation.
This module contains functions for disease models, specifically for splitting simulated sequences in diseased and healthy individuals based on their haplotypes or genotypes.
Combine haplotypes pair-wise to construct (phased) genotypes.
(pair-haplotypes haplotypes)
(use-modules (coasim rand) (coasim disease-modelling)) (define sequences (simulate-sequences (make-random-snp-markers 10 0 1) 10)) (define (print-sequences seqs) (for-each (lambda (x) (display x)(newline)) seqs)) (newline) (print-sequences sequences)(newline) (print-sequences (pair-haplotypes sequences))(newline) (print-sequences (unpair-genotypes (pair-haplotypes sequences)))(newline) (newline)
Translate a list (of even length) of haplotypes into a list of (phased) genotypes in the form of lists of pairs.
Projects a list of allels to a subset for a function call.
(project-function f indices)
(use-modules (coasim disease-modelling)) (define is-case? (project-function (lambda (a1 a3) (= a1 1) (= a3 0)) '(1 3)))
Translates a function that takes a subset of allels as input, into a function that takes the full list of alleles.
If function f takes, say, two alleles as input, and we want to call it on index 1 and 3, we can use function (project-function f '(1 3)) that will work on the full allele list by calling f with allele 1 and 3.
Calculates quantitative values for haplotypes based on selected alleles.
(qtl-on-markers sequences marker-indices f)
(use-modules (coasim disease-modelling))
(let ((qtl-value (lambda (a2 a4) (+ (* 2 a2) a4)))))
(qtl-on-markers seqs '(2 4) qtl-value)))Calculate a quantitative value for each sequence in sequences by calling f with the alleles at marker-indices for each sequence.
Returns a list of lists, where the car is the sequence and the cadr is the calculated value.
By default the alleles at the marker indices are removed from the resulting lists; an optional parameter, :remove-traits, if set to #f, will prevent removal of the marker alleles.
Removes the alleles of the marker at `idx' from the sequences.
(remove-allele sequences idx)
(use-modules (coasim disease-modelling)) (remove-allele seqs trait-idx)))
Removes the alleles at `idx' from the sequences; useful for example for removing a trait marker from a dataset.
Removes the alleles of the markers at `indices' from the sequences.
(remove-alleles sequences indices)
(use-modules (coasim disease-modelling)) (remove-allele seqs trait-indices)
Removes the alleles at `inidices' from the sequences; useful for example for removing the trait markers from a dataset.
Split a list of sequences into cases and controls.
(split-in-cases-controls sequences is-case?)
(use-modules (coasim disease-modelling))
(let* ((is-mutant? (lambda (a) (= 1 a)))
(allele (lambda (h idx) (list-ref h idx)))
(is-case? (lambda (h) (and (is-mutant? (allele h 2))
(is-mutant? (allele h 4))))))
(split-in-cases-controls seqs is-case?))Split a dataset into cases and controls, based on the predicate is-case?. If a sequence satisfy the is-case? predicate, the sequence is considered a case, otherwise a control.
Split a list of sequences into cases and controls.
(split-in-cases-controls-on-marker sequences marker-idx is-case?)
(use-modules (coasim disease-modelling))
(let ((is-case? (lambda (a) (= 1 a))))
(split-in-cases-controls-on-marker seqs marker-idx is-case?)))Split a dataset into cases and controls, based on the value of the marker at index marker-idx. If the value at that index satisfy the is-case? predicate, the sequence is considered a case, otherwise a control.
By default the alleles at the marker is removed from the resulting lists; an optional parameter, :remove-trait, if set to #f, will prevent removal of the marker alleles.
Split a list of sequences into cases and controls.
(split-in-cases-controls-on-marker/genotype sequences marker-idx is-case?)
(use-modules (coasim disease-modelling))
(let ((is-case? (lambda (p) (= 2 (apply + p)))))
(split-in-cases-controls-on-marker seqs marker-idx is-case?)))Split a dataset into cases and controls, based on the value at trait-idx. The dataset consist of a list of haplotypes, with an even number of haplotypes; the haplotypes will be considered pairwise and split into cases/controls based on their genotype at the trait marker, rather than the haplotype. The resulting cases will be the pairs of haplotypes considered cases and the controls will be the remaining pairs.
By default the alleles at the marker is removed from the resulting lists; an optional parameter, :remove-trait, if set to #f, will prevent removal of the marker alleles.
Split a list of sequences into cases and controls.
(split-in-cases-controls-on-markers sequences marker-indices is-case?)
(use-modules (coasim disease-modelling))
(let ((is-case? (lambda (a2 a4) (and (= 1 a2) (= 1 a4)))))
(split-in-cases-controls-on-markers seqs '(2 4) is-case?)))Split a dataset into cases and controls, based on the values of the alleles at indices marker-indices. If thes value at those indices satisfy the is-case? predicate, the sequence is considered a case, otherwise a control.
By default the alleles at the marker indices are removed from the resulting lists; an optional parameter, :remove-traits, if set to #f, will prevent removal of the marker alleles.
Split a list of sequences into cases and controls.
(split-in-cases-controls-on-markers/genotype sequences marker-indices is-case?)
(use-modules (coasim disease-modelling))
(let ((is-case? (lambda (g2 g4)
(let ((a2-1 (car g2)) (a2-2 (cadr g2))
(a4-1 (car g4)) (a4-2 (cadr g4)))
(or (> (+ a2-1 a2-2) 0) (= 1 a4-1 a4-2))))))
(split-in-cases-controls-on-markers/genotype seqs '(2 4) is-case?))Split a dataset into cases and controls, based on the values of the alleles at indices marker-indices. If thes value at those indices satisfy the is-case? predicate, the sequence is considered a case, otherwise a control. The dataset consist of a list of haplotypes, with an even number of haplotypes; the haplotypes will be considered pairwise and split into cases/controls based on their genotype at the trait marker, rather than the haplotype. The resulting cases will be the pairs of haplotypes considered cases and the controls will be the remaining pairs.
By default the alleles at the marker indices are removed from the resulting lists; an optional parameter, :remove-traits, if set to #f, will prevent removal of the marker alleles.
Split a list of sequences with associated probabilities based on the probability.
(split-on-probability sequence-probability-list)
(use-modules (coasim disease-modelling))
(let* ((qtl-value (lambda (a2 a4) (+ (* .2 a2) (* .1 a4))))
(qtl-seqs (qtl-on-markers seqs '(2 4) qtl-value))
(cases-controls (split-on-probability qtl-seqs))
(cases (car cases-controls))
(controls (cadr cases-controls)))
...)Split a list of sequences with associated probabilities.
The input is a list of lists, where the first element in each list is a sequence and the second is a probability. The sequences are separated into two output lists; for pair '(seq p), seq is put in the first list with probability p and in the second with probability 1-q.
Split a list of sequences with associated quantitative values based on a threshold for the value.
(split-on-threshold sequence-qtl-list threshold)
(use-modules (coasim disease-modelling))
(let* ((qtl-value (lambda (a2 a4) (+ (* 2 a2) a4)))
(qtl-seqs (qtl-on-markers seqs '(2 4) qtl-value))
(cases-controls (split-on-threshold qtl-seqs 1.5))
(cases (car cases-controls))
(controls (cadr cases-controls)))
...)Split a list of sequences with associated quantitative values based on a threshold for the value.
The input is a list of lists, where the first element in each list is a sequence and the second is a quantitative value. This is the kind of values that qtl-on-markers return.
The result is a list where the car contains the sequences where the associated value was below the threshold and where the cadr contains the sequences where the associated value was equal to or above the threshold.
Translates a table into a function usable by e.g. the qtl-on-markers function.
(table->function table)
(use-modules (coasim disease-modelling))
(define f-hash
(let ((h (make-hash-table 4)))
(hash-create-handle! h '(0 0) 0)
(hash-create-handle! h '(0 1) 0)
(hash-create-handle! h '(1 0) .1)
(hash-create-handle! h '(1 1) .5)
(table->function h)))
(define f-alist
(let ((t
(acons '(0 0) 0
(acons '(0 1) 0
(acons '(1 0) .1
(acons '(1 1) .5
'()))))))
(table->function t)))
(define f-alist
(let ((t
(acons '(1 0) .1
(acons '(1 1) .5
'()))))))
(table->function t)))
(define f-alist
(let ((t
(acons '(1 0) .1
(acons '(1 1) .5
'()))))))
(table->function t 0.1)))Translates a table, either a hash table or an associative list, into a function that can be used with qtl-on-markers or split-in-cases-controls-on-markers.
An optional second parameter can be used to set the default value for the function to return when the table does not contain whatever value the function is called with. By default this value is 0.0.
Translates a list of (phased) genotypes into haplotypes.
(unpair-genotypes genotypes)
(use-modules (coasim rand) (coasim disease-modelling)) (define sequences (simulate-sequences (make-random-snp-markers 10 0 1) 10)) (define (print-sequences seqs) (for-each (lambda (x) (display x)(newline)) seqs)) (newline) (print-sequences sequences)(newline) (print-sequences (pair-haplotypes sequences))(newline) (print-sequences (unpair-genotypes (pair-haplotypes sequences)))(newline) (newline)
Translate a list (phased) genotypes in the form of lists of pairs into a list of haplotypes.
This module contains functions for reading and writing coasim data.
Convenience function for printing marker-positions.
(marker-positions-port port)
(map (position-port port) list-of-marker-positions)
A convenience function, based on `print-marker-positions', that, given a list a port, gives a function with one argument, a position, that when applied writes the marker-positions to the port.
The function is useful for mapping over lists of marker-positions.
Convenience function for printing marker-positions.
(marker-positions-printer marker-positions)
(call-with-output-file "marker-positions.txt" (marker-positions-printer marker-positions))
A convenience function, based on `print-marker-positions', that, given a list of marker-positions, gives a function with one argument, a port, that when applied writes the marker-positions to the port.
The function is useful for calls to `call-with-output-file'.
Convenience function for printing positions.
(positions-port port)
(map (position-port port) list-of-positions)
A convenience function, based on `print-positions', that, given a list a port, gives a function with one argument, a position, that when applied writes the positions to the port.
The function is useful for mapping over lists of positions.
Convenience function for printing positions.
(positions-printer positions)
(call-with-output-file "positions.txt" (positions-printer positions))
A convenience function, based on `print-positions', that, given a list of positions, gives a function with one argument, a port, that when applied writes the positions to the port.
The function is useful for calls to `call-with-output-file'.
Print a list of positions from markers.
(print-marker-positions port markers)
(print-marker-positions (current-output-port) markers)
This function prints the positions of the markers in the list `markers' to the output port `port'.
Print a list of positions.
(print-positions port positions)
(print-positions (current-output-port) positions)
This function prints the positions (numbers) in the list `positions' to the output port `port'.
Print a list of sequences.
(print-sequences port sequences)
(print-sequences (current-output-port) sequences)
This function prints the sequences in the list `sequences' to the output port `port'.
Convenience function for printing sequences.
(sequences-port port)
(map (sequence-port port) list-of-sequences)
A convenience function, based on `print-sequences', that, given a list a port, gives a function with one argument, a sequence, that when applied writes the sequences to the port.
The function is useful for mapping over lists of sequences.
Convenience function for printing sequences.
(sequences-printer sequences)
(call-with-output-file "sequences.txt" (sequences-printer sequences))
A convenience function, based on `print-sequences', that, given a list of sequences, gives a function with one argument, a port, that when applied writes the sequences to the port.
The function is useful for calls to `call-with-output-file'.
This module contains functions manipulating sorted lists of markers.
Inserts a marker in a sorted list of markers.
(insert-sorted sorted-marker-list marker)
(insert-sorted sorted-marker-list marker)
This function inserts a marker into a sorted list of markers and return the resulting list.
Inserts a marker in a sorted list of markers..
(insert-sorted-idx marker-list marker)
(define snp-markers (make-random-snp-markers 10 0.1 0.9)) (define disease-marker (car (make-random-trait-markers 1 0.18 0.22))) (define markers-and-index (insert-sorted-idx snp-markers disease-marker))
This function inserts a marker into a sorted list of markers and return a list who's first element is the resulting list and who's second element is the index the marker got in the new list.
Merge two or more lists of sorted markers.
(merge-markers . list-of-marker-lists)
(merge-markers marker-list-1 marker-list-2 marker-list-3)
Merge two or more lists of sorted markers.
Rescales marker positions to simulate variable recombination rate.
(rescale-markers! markers recombination-rates)
(use-modules (coasim markers)) (define recomb-rates '((0.5 40) (0.25 400) (0.25 40))) (define scaled (rescale-markers! markers recomb-rates))
Rescales the positions of the markers in `markers' according to `recombination-rates', a list of pairs where the first element is a lenght (of the 0-1 interval) and the second is the recombination rate over that range. (The actual value of the rate is not important, only the relative differences between the rates in the list).
Similar to the function `rescale-positions', except that for this function the markers are moved according to the rescaling.
Rescales marker positions to simulate variable recombination rate.
(rescale-positions positions recombination-rates)
(use-modules (coasim markers)) (define positions '(0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9)) (define recomb-rates '((0.5 40) (0.25 400) (0.25 40))) (define new-positions (rescale-positions positions recomb-rates))
Rescales the positions in `positions' according to `recombination-rates', a list of pairs where the first element is a lenght (of the 0-1 interval) and the second is the recombination rate over that range. (The actual value of the rate is not important, only the relative differences between the rates in the list).
The `recombination-rates' list is translated into a piece-wise linear function, where the different regions of the 0-1 interval specified by the distances in the list have different slope
Let li be length i in `recombination-rages', and let bpi be the sum of the first i lengths: Σj<i li. Let ai be rate i in `recombination-rates', and let bi be the sum Σj<i ai*di.
The positions are then translated by linear functions: fi(x) = ai*x + bi, such that for positions p: bpi<p<= bpi+1 p is translated into fi(p).
The input positions must be sorted.
Sort a list of markers.
(sort-markers marker-list)
(sort-markers marker-list)
Sort a list of markers with relation to their position.
This module contains functions making random marker configurations.
Make a list of random micro-satellite markers.
(make-random-ms-markers no-markers theta K)
(define snp-markers (make-random-ms-markers 10 0.1 15))
Make a list of random positioned micro-satellite markers. The theta and K parameters are similar to the `ms-marker' function.
Make a list of random positions.
(make-random-positions no-positions)
(define positions (make-random-positions 10))
Makes a list of random positions between 0 and 1 (0 included, 1 excluded). The positions are returned sorted.
Make a list of random SNP markers.
(make-random-snp-markers no-markers low-freq high-freq)
(define snp-markers (make-random-snp-markers 10 0.1 0.9))
Make a list of random positioned SNP markers. The low and high frequency constraints are similar to the `snp-marker' function.
Make a list of random trait markers.
(make-random-trait-markers no-markers low-freq high-freq)
(define disease-marker (car (make-random-trait-markers 1 0.18 0.22)))
Make a list of random positioned trait markers. The low and high frequency constraints are similar to the `trait-marker' function.
This module contains functions for calculating various statistics on the ARG and on lists of haplotypes.
Tajima's D statistics.
(D lst)
(D '((0 1 1 1 1 0) (0 0 1 1 1 0) (0 1 1 0 1 0) (0 0 0 1 1 0)))
Calculates Tajima's D statistics on a list of haplotypes.
The number of mutations in a sample.
(S lst)
(S '((0 1 1 1 1 0) (0 0 1 1 1 0) (0 1 1 0 1 0) (0 0 0 1 1 0)))
Calculates the number of mutations in a sample (calculated as the number of informative sites).
Returns the number of coalescence events in the ARG.
(no-coalescence-events arg)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define arg (simulate markers 100 :rho 400)) (define n (no-coalescence-events arg))
Returns the number of coalescence events in the ARG.
Returns the number of gene conversions in the ARG.
(no-gene-conversions arg)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define arg (simulate markers 100 :gamma 10 :Q 0.2)) (define n (no-gene-conversions arg))
Returns the number of gene conversions in the ARG.
Returns the number of recombinations in the ARG.
(no-recombinations arg)
(define markers (make-random-snp-markers 10 0.1 0.9)) (define arg (simulate markers 100 :rho 400)) (define n (no-recombinations arg))
Returns the number of recombinations in the ARG.
Calculates the average pairwise difference.
(pi list-of-haplotypes)
(pi '((0 1 1 0) (1 0 1 0) (1 0 1 1)))
Calculates the average pairwise difference on a list of haplotypes.
Calculates the number of differences between two haplotypes.
(pi_ij si sj)
(pi_ij '(0 1 1 0) '(1 0 1 0))
Calculates the number of differences between two haplotypes.
Watterson's mutation rate estimate.
(theta_W lst)
(theta_W '((0 1 1 1 1 0) (0 0 1 1 1 0) (0 1 1 0 1 0) (0 0 0 1 1 0)))
Calculates Watterson's mutation rate estimate.
This module contains a step-wise mutation model for micro-satellite markers.
A step-wise mutation model marker.
(step-ms-marker position theta)
(step-ms-marker 0.1 1.5)
Creates a marker, at position pos, implementing the step-wise mutation model, i.e. whenever the marker mutates (and it does with rate theta) it increases or decreases by one with equal probability.
The ancestral value of the marker is 0, but this can be changed using the key-word parameter :initial-value.