Creating a BioMol Object from Scratch

In this tutorial, you will:

  • Understand the role of BioMol as a unified container for molecular structures.

  • Learn how to define and organize atom, residue, and chain features.

  • Instantiate a BioMol object and explore its basic access patterns.

Importing Modules

To create a BioMol instance, you will:

  1. Define features for atoms, residues, and chains.

  2. Build feature containers for each level.

  3. Create an index table to establish hierarchical relationships.

Let’s begin by importing the necessary modules:

import numpy as np

from biomol import BioMol
from biomol.core import EdgeFeature, FeatureContainer, IndexTable, NodeFeature

Defining Features

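Next, define features for atoms, residues, and chains, such as coordinates, names, or custom flags. Use NodeFeature for values attached to individual nodes (one entry per atom, residue, or chain) and EdgeFeature for values attached to pairs of nodes, such as bonds.

For an edge feature, src_indices and dst_indices hold the indices of the two nodes that each edge connects within the same container: entry i of value describes the edge from node src_indices[i] to node dst_indices[i].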

atom_positions = NodeFeature(
    value=np.array(
        [
            [0.0, 0.0, 0.0],
            [1.4, 0.0, 0.0],
            [1.4, 1.4, 0.0],
            [0.0, 1.4, 0.0],  # ALA-1
            [2.8, 0.0, 0.0],
            [4.2, 0.0, 0.0],
            [4.2, 1.4, 0.0],
            [2.8, 1.4, 0.0],  # GLY-2
            [5.6, 0.0, 0.0],
            [7.0, 0.0, 0.0],
            [7.0, 1.4, 0.0],
            [5.6, 1.4, 0.0],  # ALA-3
        ],
    ),
)

atom_names = NodeFeature(
    value=np.array(["N", "CA", "C", "O"] * 3),
)

atom_bond = EdgeFeature(
    value=np.array(["covalent"] * 11),
    # Bond i connects atom src_indices[i] to atom dst_indices[i]:
    # the N-CA, CA-C, and C-O bonds within each residue, plus the
    # C-N bonds (2-4 and 6-8) that link consecutive residues.
    src_indices=np.array([0, 1, 2, 2, 4, 5, 6, 6, 8, 9, 10]),
    dst_indices=np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]),
)

residue_ids = NodeFeature(value=np.array([1, 2, 3]))
residue_names = NodeFeature(value=np.array(["ALA", "GLY", "ALA"]))
chain_ids = NodeFeature(value=np.array(["A"]))
chain_entities = NodeFeature(value=np.array(["PROTEIN"]))
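
To see how the edge indices line up with the atoms, the following sketch pairs each bond's source and destination with the corresponding atom name. It uses plain NumPy arrays that mirror the values above rather than the biomol classes, so it makes no claims about the library's own API.

```python
import numpy as np

# Same atom names and bond indices as in the feature definitions above.
atom_names = np.array(["N", "CA", "C", "O"] * 3)
src_indices = np.array([0, 1, 2, 2, 4, 5, 6, 6, 8, 9, 10])
dst_indices = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

# Edge i connects atom src_indices[i] to atom dst_indices[i].
bonds = [
    f"{atom_names[s]}({s})-{atom_names[d]}({d})"
    for s, d in zip(src_indices, dst_indices)
]

for bond in bonds:
    print(bond)
```

Every index stays below 12, the number of atoms, which is what it means for the edges to refer to nodes in the same container.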

Building Containers and Index Tables

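After defining features, create FeatureContainer instances for atoms, residues, and chains. The node features in a container must have matching entry counts (here 12 atoms, 3 residues, and 1 chain). An edge feature such as bond instead has one entry per edge (11 here), with indices that refer to the nodes of that container.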

atom_container = FeatureContainer(
    {"positions": atom_positions, "name": atom_names, "bond": atom_bond},
)

residue_container = FeatureContainer(
    {"id": residue_ids, "name": residue_names},
)

chain_container = FeatureContainer(
    {"id": chain_ids, "entity": chain_entities},
)
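
Before building containers, it can help to confirm that the arrays for one level agree in length. The helper below, check_node_counts, is hypothetical and not part of biomol; it is a minimal sketch of such a check on plain NumPy arrays.

```python
import numpy as np


def check_node_counts(features):
    """Return the shared entry count of the arrays, or raise if they disagree.

    Hypothetical helper for illustration; not part of the biomol API.
    """
    counts = {name: len(values) for name, values in features.items()}
    if len(set(counts.values())) != 1:
        raise ValueError(f"mismatched entry counts: {counts}")
    return next(iter(counts.values()))


# Atom-level arrays: 12 entries each.
n_atoms = check_node_counts(
    {
        "positions": np.zeros((12, 3)),
        "name": np.array(["N", "CA", "C", "O"] * 3),
    }
)

# Residue-level arrays: 3 entries each.
n_residues = check_node_counts(
    {
        "id": np.array([1, 2, 3]),
        "name": np.array(["ALA", "GLY", "ALA"]),
    }
)

print(n_atoms, n_residues)
```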

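Then, create an IndexTable to define hierarchical relationships among atoms, residues, and chains. Each array maps a child to the index of its parent: atom_to_res[i] is the residue index of atom i, and res_to_chain[j] is the chain index of residue j.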

index_table = IndexTable.from_parents(
    atom_to_res=np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]),
    res_to_chain=np.array([0, 0, 0]),
    n_chain=1,  # Optional, can be inferred from res_to_chain if not provided
)
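
The parent arrays carry enough information to recover the whole hierarchy. As a rough illustration in plain NumPy (this is not how IndexTable is necessarily implemented, and the way n_chain is inferred here is an assumption), the sketch below derives each atom's chain, the number of children per parent, and a chain count.

```python
import numpy as np

atom_to_res = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
res_to_chain = np.array([0, 0, 0])

# Composing the two parent arrays gives the chain index of every atom.
atom_to_chain = res_to_chain[atom_to_res]

# Counting how often each parent index occurs gives the number of children.
atoms_per_residue = np.bincount(atom_to_res, minlength=len(res_to_chain))
residues_per_chain = np.bincount(res_to_chain)

# One plausible way to infer the chain count: largest parent index plus one.
n_chain = int(res_to_chain.max()) + 1

print(atom_to_chain)
print(atoms_per_residue, residues_per_chain, n_chain)
```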

Constructing the BioMol Object

Finally, instantiate the BioMol object using the feature containers and index table you prepared.

mol = BioMol(
    atom_container=atom_container,
    residue_container=residue_container,
    chain_container=chain_container,
    index_table=index_table,
)

The BioMol class provides unified access at the atom, residue, and chain levels. You can access features with attribute-style syntax (e.g., mol.atoms.positions) or via the get_feature method for explicit retrieval.

mol.atoms.get_feature("positions")
# or
mol.atoms.positions
NodeFeature(value=array([[0. , 0. , 0. ],
       [1.4, 0. , 0. ],
       [1.4, 1.4, 0. ],
       [0. , 1.4, 0. ],
       [2.8, 0. , 0. ],
       [4.2, 0. , 0. ],
       [4.2, 1.4, 0. ],
       [2.8, 1.4, 0. ],
       [5.6, 0. , 0. ],
       [7. , 0. , 0. ],
       [7. , 1.4, 0. ],
       [5.6, 1.4, 0. ]]))
mol.residues.name
NodeFeature(value=array(['ALA', 'GLY', 'ALA'], dtype='<U3'))
mol.chains.id
NodeFeature(value=array(['A'], dtype='<U1'))
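
The printed features wrap NumPy arrays, so their contents can feed ordinary array computations. The sketch below computes the length of every bond from the coordinates and the bond indices. It uses plain NumPy arrays standing in for the feature values; how a feature exposes its underlying array (for example through a value attribute, as the printed output suggests) is an assumption here rather than confirmed API.

```python
import numpy as np

# Coordinates and bond indices copied from the tutorial's features.
positions = np.array(
    [
        [0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [1.4, 1.4, 0.0], [0.0, 1.4, 0.0],
        [2.8, 0.0, 0.0], [4.2, 0.0, 0.0], [4.2, 1.4, 0.0], [2.8, 1.4, 0.0],
        [5.6, 0.0, 0.0], [7.0, 0.0, 0.0], [7.0, 1.4, 0.0], [5.6, 1.4, 0.0],
    ]
)
src = np.array([0, 1, 2, 2, 4, 5, 6, 6, 8, 9, 10])
dst = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

# Index the coordinates by both ends of each bond, then take the
# Euclidean norm of the difference along the coordinate axis.
bond_lengths = np.linalg.norm(positions[dst] - positions[src], axis=1)

print(np.round(bond_lengths, 2))
```

In this toy geometry the bonds within a residue are 1.4 units long, while the two bonds that link consecutive residues run diagonally and are longer by a factor of the square root of 2.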