{ "cells": [ { "cell_type": "markdown", "id": "7722c406", "metadata": {}, "source": [ "# Creating a BioMol Object from Scratch\n", "\n", "```{eval-rst}\n", ".. currentmodule:: biomol\n", "```\n", "\n", "In this tutorial, you will:\n", "- Understand the role of {py:class}`BioMol` as a unified container for molecular structures.\n", "- Learn how to define and organize atom, residue, and chain features.\n", "- Instantiate a {py:class}`BioMol` object and explore its basic access patterns." ] }, { "cell_type": "markdown", "id": "8c9b8806", "metadata": {}, "source": [ "## Importing Modules\n", "\n", "To create a {py:class}`BioMol` instance, you will:\n", "\n", "1. Define features for atoms, residues, and chains.\n", "2. Build feature containers for each level.\n", "3. Create an index table to establish hierarchical relationships.\n", "4. Combine the containers and the index table into a {py:class}`BioMol` object.\n", "\n", "Let’s begin by importing the necessary modules:" ] }, { "cell_type": "code", "execution_count": 20, "id": "76fab666", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "from biomol import BioMol\n", "from biomol.core import EdgeFeature, FeatureContainer, IndexTable, NodeFeature" ] }, { "cell_type": "markdown", "id": "98160a94", "metadata": {}, "source": [ "## Defining Features\n", "\n", "Next, define features, such as element symbols or custom flags, for atoms, residues, and chains.\n", "Use {py:class}`NodeFeature` for node-associated features or {py:class}`EdgeFeature` for edge-associated features. \n", "\n", "For edge features, `src_indices` and `dst_indices` hold the indices of the source and destination nodes in the same container."
] }, { "cell_type": "code", "execution_count": 21, "id": "d325974c", "metadata": {}, "outputs": [], "source": [ "atom_positions = NodeFeature(\n", "    value=np.array(\n", "        [\n", "            [0.0, 0.0, 0.0],\n", "            [1.4, 0.0, 0.0],\n", "            [1.4, 1.4, 0.0],\n", "            [0.0, 1.4, 0.0],  # ALA-1\n", "            [2.8, 0.0, 0.0],\n", "            [4.2, 0.0, 0.0],\n", "            [4.2, 1.4, 0.0],\n", "            [2.8, 1.4, 0.0],  # GLY-2\n", "            [5.6, 0.0, 0.0],\n", "            [7.0, 0.0, 0.0],\n", "            [7.0, 1.4, 0.0],\n", "            [5.6, 1.4, 0.0],  # ALA-3\n", "        ],\n", "    ),\n", ")\n", "\n", "atom_names = NodeFeature(\n", "    value=np.array([\"N\", \"CA\", \"C\", \"O\"] * 3),\n", ")\n", "\n", "# Bonds within each residue (N-CA, CA-C, C-O) plus the C-N peptide bonds between residues.\n", "atom_bond = EdgeFeature(\n", "    value=np.array([\"covalent\"] * 11),\n", "    src_indices=np.array([0, 1, 2, 2, 4, 5, 6, 6, 8, 9, 10]),\n", "    dst_indices=np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]),\n", ")" ] }, { "cell_type": "code", "execution_count": 22, "id": "5d5c518b", "metadata": {}, "outputs": [], "source": [ "residue_ids = NodeFeature(value=np.array([1, 2, 3]))\n", "residue_names = NodeFeature(value=np.array([\"ALA\", \"GLY\", \"ALA\"]))" ] }, { "cell_type": "code", "execution_count": 23, "id": "e78fc3cf", "metadata": {}, "outputs": [], "source": [ "chain_ids = NodeFeature(value=np.array([\"A\"]))\n", "chain_entities = NodeFeature(value=np.array([\"PROTEIN\"]))" ] }, { "cell_type": "markdown", "id": "9b18b542", "metadata": {}, "source": [ "## Building Containers and Index Tables\n", "\n", "After defining features, create {py:class}`FeatureContainer` instances for atoms, residues, and chains. The node features in a container must all have the same number of entries: 12 for atoms, 3 for residues, and 1 for chains in this example."
] }, { "cell_type": "code", "execution_count": 24, "id": "7eea721c", "metadata": {}, "outputs": [], "source": [ "atom_container = FeatureContainer(\n", "    {\"positions\": atom_positions, \"name\": atom_names, \"bond\": atom_bond},\n", ")\n", "\n", "residue_container = FeatureContainer(\n", "    {\"id\": residue_ids, \"name\": residue_names},\n", ")\n", "\n", "chain_container = FeatureContainer(\n", "    {\"id\": chain_ids, \"entity\": chain_entities},\n", ")" ] }, { "cell_type": "markdown", "id": "e99b72ac", "metadata": {}, "source": [ "Then, create an {py:class}`IndexTable` to define hierarchical relationships among atoms, residues, and chains." ] }, { "cell_type": "code", "execution_count": 25, "id": "8f9d0164", "metadata": {}, "outputs": [], "source": [ "index_table = IndexTable.from_parents(\n", "    atom_to_res=np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]),\n", "    res_to_chain=np.array([0, 0, 0]),\n", "    n_chain=1,  # Optional; inferred from res_to_chain if omitted\n", ")" ] }, { "cell_type": "markdown", "id": "ad2ee764", "metadata": {}, "source": [ "## Constructing the BioMol Object\n", "\n", "Finally, instantiate the {py:class}`BioMol` object using the feature containers and index table you prepared." ] }, { "cell_type": "code", "execution_count": 26, "id": "6e2af0c7", "metadata": {}, "outputs": [], "source": [ "mol = BioMol(\n", "    atom_container=atom_container,\n", "    residue_container=residue_container,\n", "    chain_container=chain_container,\n", "    index_table=index_table,\n", ")" ] }, { "cell_type": "markdown", "id": "ea980da8", "metadata": {}, "source": [ "The {py:class}`BioMol` class provides unified access at the **atom**, **residue**, and **chain** levels. \n", "You can access features with attribute-style syntax (e.g., `mol.atoms.positions`) or via the `get_feature` method for explicit retrieval."
] }, { "cell_type": "code", "execution_count": 27, "id": "9dbfbbb3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NodeFeature(value=array([[0. , 0. , 0. ],\n", " [1.4, 0. , 0. ],\n", " [1.4, 1.4, 0. ],\n", " [0. , 1.4, 0. ],\n", " [2.8, 0. , 0. ],\n", " [4.2, 0. , 0. ],\n", " [4.2, 1.4, 0. ],\n", " [2.8, 1.4, 0. ],\n", " [5.6, 0. , 0. ],\n", " [7. , 0. , 0. ],\n", " [7. , 1.4, 0. ],\n", " [5.6, 1.4, 0. ]]), description=None)" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mol.atoms.get_feature(\"positions\")\n", "# or\n", "mol.atoms.positions" ] }, { "cell_type": "code", "execution_count": 28, "id": "74afc8c7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NodeFeature(value=array(['ALA', 'GLY', 'ALA'], dtype='