
pseudolikelihood maximization Direct-Coupling Analysis (plmDCA)

by Magnus Ekeberg

This web page provides MATLAB code (with accompanying routines written in C) for plmDCA. plmDCA takes a multiple sequence alignment as input and returns scores for pairwise (direct) interactions between its columns. The method is described at length in:

(1) M. Ekeberg, C. Lövkvist, Y. Lan, M. Weigt, E. Aurell, Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models, Phys. Rev. E 87, 012707 (2013)
(2) M. Ekeberg, T. Hartonen, E. Aurell, Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences, J. Comput. Phys. 276, 341-356 (2014)

If you use plmDCA (modified or as is) for your own research, please cite the papers above. See the files for copyright conditions and instructions on how to use the code. Send comments and suggestions to magnus.ekeberg (at) gmail (dot) com.

There are two versions of plmDCA: the original "symmetric" version from (1) and the newer "asymmetric" version from (2). The latter produces output almost identical to that of the original, but is inherently faster and better suited to parallel execution. We therefore currently provide only the newer variant (for access to the older version, please contact us directly).

Update Aug 11: v2 of asymmetric plmDCA uploaded

Changes from v1:
- Regularization strengths are now chosen automatically by the program, i.e. they are no longer input arguments. If the number of nonredundant homologous sequences B_eff is >500, the previously recommended standard values lambda_h=lambda_J=0.01 are used, whereas for B_eff<=500 the value is instead taken as 0.1-(0.1-0.01)*B_eff/500. This is because we have observed a slight boost in accuracy when using stronger regularization at low B_eff.
- Contributions from the gap-state are now excluded during computation of the coevolution score. This has been found to yield a small but consistent improvement.
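
The automatic choice of regularization strength described in the first item can be illustrated with a few lines of code. The following is a minimal Python sketch of the stated rule only, not the MATLAB code distributed here; the function name regularization_strength and the argument name b_eff are hypothetical and chosen for illustration.

```python
def regularization_strength(b_eff: float) -> float:
    """Return the common value lambda_h = lambda_J for a given B_eff.

    Illustrative sketch of the rule stated in the v2 change notes;
    the function and argument names are assumptions, not plmDCA's API.
    """
    if b_eff < 0:
        raise ValueError("B_eff must be non-negative")
    if b_eff > 500:
        # Previously recommended standard value for large alignments.
        return 0.01
    # Linear interpolation from 0.1 (at B_eff = 0) down to 0.01 (at B_eff = 500),
    # i.e. stronger regularization when few nonredundant sequences are available.
    return 0.1 - (0.1 - 0.01) * b_eff / 500


if __name__ == "__main__":
    for b_eff in (0, 100, 250, 500, 2000):
        print(b_eff, regularization_strength(b_eff))
```

Under this rule the strength decreases linearly from 0.1 at B_eff=0 to 0.01 at B_eff=500, so the two branches agree at the threshold and the value changes continuously with B_eff.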

Download plmDCA

If the download does not work, please use GitHub: http//

Please fill in the form below to download:


Personal information given on this page will be used for internal usage statistics.

Keywords: plmDCA, pseudolikelihood, direct-coupling analysis, protein structure prediction, contact map, multiple sequence alignment, inverse Ising, Potts model, pairwise Markov random field, learning, inference

Published by Nicolas Innocenti