{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "7722c406",
      "metadata": {},
      "source": [
        "# Creating a BioMol Object from Scratch\n",
        "\n",
        "```{eval-rst}\n",
        ".. currentmodule:: biomol\n",
        "```\n",
        "\n",
        "In this tutorial, you will:\n",
        "- Understand the role of {py:class}`BioMol` as a unified container for molecular structures.\n",
        "- Learn how to define and organize atom, residue, and chain features.\n",
        "- Instantiate a {py:class}`BioMol` object and explore its basic access patterns."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "8c9b8806",
      "metadata": {},
      "source": [
        "## Importing Modules\n",
        "\n",
        "To create a {py:class}`BioMol` instance, you will:\n",
        "\n",
        "1.\tDefine features for atoms, residues, and chains.\n",
        "2.\tBuild feature containers for each level.\n",
        "3.\tCreate an index table to establish hierarchical relationships.\n",
        "\n",
        "Let’s begin by importing the necessary modules:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 20,
      "id": "76fab666",
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "from biomol import BioMol\n",
        "from biomol.core import EdgeFeature, FeatureContainer, IndexTable, NodeFeature"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "98160a94",
      "metadata": {},
      "source": [
        "## Defining Features\n",
        "\n",
        "Next, define features—such as element symbols or custom flags—for atoms, residues, and chains.\n",
        "Use {py:class}`NodeFeature <core.NodeFeature>` for node-associated features or {py:class}`EdgeFeature <core.EdgeFeature>` for edge-associated features. \n",
        "\n",
        "For edge features, the {py:class}`src_indices <core.EdgeFeature.src_indices>` and {py:class}`dst_indices <core.EdgeFeature.dst_indices>` correspond to nodes in the container."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 21,
      "id": "d325974c",
      "metadata": {},
      "outputs": [],
      "source": [
        "atom_positions = NodeFeature(\n",
        "    value=np.array(\n",
        "        [\n",
        "            [0.0, 0.0, 0.0],\n",
        "            [1.4, 0.0, 0.0],\n",
        "            [1.4, 1.4, 0.0],\n",
        "            [0.0, 1.4, 0.0],  # ALA-1\n",
        "            [2.8, 0.0, 0.0],\n",
        "            [4.2, 0.0, 0.0],\n",
        "            [4.2, 1.4, 0.0],\n",
        "            [2.8, 1.4, 0.0],  # GLY-2\n",
        "            [5.6, 0.0, 0.0],\n",
        "            [7.0, 0.0, 0.0],\n",
        "            [7.0, 1.4, 0.0],\n",
        "            [5.6, 1.4, 0.0],  # ALA-3\n",
        "        ],\n",
        "    ),\n",
        ")\n",
        "\n",
        "atom_names = NodeFeature(\n",
        "    value=np.array([\"N\", \"CA\", \"C\", \"O\"] * 3),\n",
        ")\n",
        "\n",
        "atom_bond = EdgeFeature(\n",
        "    value=np.array(\n",
        "        [\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "            \"covalent\",\n",
        "        ],\n",
        "    ),\n",
        "    src_indices=np.array([0, 1, 2, 2, 4, 5, 6, 6, 8, 9, 10]),\n",
        "    dst_indices=np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]),\n",
        ")"
      ]
    },
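    {
      "cell_type": "markdown",
      "id": "b4e1a7c2",
      "metadata": {},
      "source": [
        "The edge-index convention described above can be sanity-checked with plain NumPy before a feature is placed in a container. The sketch below is only an illustration: it rebuilds the same index arrays as standalone variables (the variable names are assumptions, not part of the `biomol` API) and checks that every index refers to an existing atom.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "# Same bond indices as the atom_bond feature (12 atoms, 11 bonds).\n",
        "n_atoms = 12\n",
        "src_indices = np.array([0, 1, 2, 2, 4, 5, 6, 6, 8, 9, 10])\n",
        "dst_indices = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])\n",
        "\n",
        "# Each edge needs exactly one source and one destination.\n",
        "assert src_indices.shape == dst_indices.shape\n",
        "\n",
        "# Every endpoint must be a valid zero-based atom index.\n",
        "endpoints = np.concatenate([src_indices, dst_indices])\n",
        "assert endpoints.min() >= 0\n",
        "assert endpoints.max() < n_atoms\n",
        "\n",
        "# Edge i connects atom src_indices[i] to atom dst_indices[i].\n",
        "atom_names = np.array([\"N\", \"CA\", \"C\", \"O\"] * 3)\n",
        "bonded_pairs = np.stack(\n",
        "    [atom_names[src_indices], atom_names[dst_indices]],\n",
        "    axis=1,\n",
        ")\n",
        "print(bonded_pairs[:4])\n",
        "```\n",
        "\n",
        "The first four pairs are N-CA, CA-C, C-O, and C-N; the last one is the bond that links ALA-1 to GLY-2."
      ]
    },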
    {
      "cell_type": "code",
      "execution_count": 22,
      "id": "5d5c518b",
      "metadata": {},
      "outputs": [],
      "source": [
        "residue_ids = NodeFeature(value=np.array([1, 2, 3]))\n",
        "residue_names = NodeFeature(value=np.array([\"ALA\", \"GLY\", \"ALA\"]))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 23,
      "id": "e78fc3cf",
      "metadata": {},
      "outputs": [],
      "source": [
        "chain_ids = NodeFeature(value=np.array([\"A\"]))\n",
        "chain_entities = NodeFeature(value=np.array([\"PROTEIN\"]))"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9b18b542",
      "metadata": {},
      "source": [
        "## Building Containers and Index Tables\n",
        "\n",
        "After defining features, create {py:class}`FeatureContainer <core.FeatureContainer>` instances for atoms, residues, and chains. Each container must hold features with matching entry counts."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "id": "7eea721c",
      "metadata": {},
      "outputs": [],
      "source": [
        "atom_container = FeatureContainer(\n",
        "    {\"positions\": atom_positions, \"name\": atom_names, \"bond\": atom_bond},\n",
        ")\n",
        "\n",
        "residue_container = FeatureContainer(\n",
        "    {\"id\": residue_ids, \"name\": residue_names},\n",
        ")\n",
        "\n",
        "chain_container = FeatureContainer(\n",
        "    {\"id\": chain_ids, \"entity\": chain_entities},\n",
        ")"
      ]
    },
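    {
      "cell_type": "markdown",
      "id": "c92f6d10",
      "metadata": {},
      "source": [
        "If you assemble features from several sources, it can help to compare entry counts before building a container. The following is a minimal sketch in plain NumPy; `check_node_counts` is a hypothetical helper written for this tutorial, not a `biomol` function, and it says nothing about how the library validates its inputs.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "\n",
        "def check_node_counts(arrays):\n",
        "    \"\"\"Hypothetical helper (not part of biomol): return the shared entry count.\"\"\"\n",
        "    counts = {name: len(values) for name, values in arrays.items()}\n",
        "    if len(set(counts.values())) != 1:\n",
        "        raise ValueError(f\"Mismatched entry counts: {counts}\")\n",
        "    return next(iter(counts.values()))\n",
        "\n",
        "\n",
        "# The raw arrays behind the residue-level features used in this tutorial.\n",
        "n_residues = check_node_counts(\n",
        "    {\"id\": np.array([1, 2, 3]), \"name\": np.array([\"ALA\", \"GLY\", \"ALA\"])},\n",
        ")\n",
        "print(n_residues)\n",
        "```\n",
        "\n",
        "The helper raises a `ValueError` that lists the count for each name when the arrays disagree."
      ]
    },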
    {
      "cell_type": "markdown",
      "id": "e99b72ac",
      "metadata": {},
      "source": [
        "Then, create an {py:class}`IndexTable <core.IndexTable>` to define hierarchical relationships among atoms, residues, and chains."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "id": "8f9d0164",
      "metadata": {},
      "outputs": [],
      "source": [
        "index_table = IndexTable.from_parents(\n",
        "    atom_to_res=np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]),\n",
        "    res_to_chain=np.array([0, 0, 0]),\n",
        "    n_chain=1,  # Optional, can be inferred from res_to_chain if not provided\n",
        ")"
      ]
    },
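    {
      "cell_type": "markdown",
      "id": "d07a3e5b",
      "metadata": {},
      "source": [
        "To see what the two parent arrays encode, the sketch below works with them directly in NumPy. It is an illustration of the child-to-parent mapping only and makes no claim about how `IndexTable` stores or computes these relationships internally.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "atom_to_res = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])\n",
        "res_to_chain = np.array([0, 0, 0])\n",
        "\n",
        "# Composing the two parent arrays gives the chain of each atom.\n",
        "atom_to_chain = res_to_chain[atom_to_res]\n",
        "\n",
        "# Counting children per parent recovers the size of each group.\n",
        "atoms_per_residue = np.bincount(atom_to_res, minlength=len(res_to_chain))\n",
        "residues_per_chain = np.bincount(res_to_chain)\n",
        "\n",
        "# With zero-based chain indices, the chain count is the largest index plus one.\n",
        "n_chain = int(res_to_chain.max()) + 1\n",
        "\n",
        "print(atoms_per_residue, residues_per_chain, n_chain)\n",
        "```\n",
        "\n",
        "For this tripeptide, each residue owns four atoms and the single chain owns all three residues."
      ]
    },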
    {
      "cell_type": "markdown",
      "id": "ad2ee764",
      "metadata": {},
      "source": [
        "## Constructing the BioMol Object\n",
        "\n",
        "Finally, instantiate the {py:class}`BioMol <biomol.BioMol>` object using the feature containers and index table you prepared."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "id": "6e2af0c7",
      "metadata": {},
      "outputs": [],
      "source": [
        "mol = BioMol(\n",
        "    atom_container=atom_container,\n",
        "    residue_container=residue_container,\n",
        "    chain_container=chain_container,\n",
        "    index_table=index_table,\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "ea980da8",
      "metadata": {},
      "source": [
        "The {py:class}`BioMol` class provides unified access at the **atom**, **residue**, and **chain** levels. \n",
        "You can access features with attribute-style syntax (e.g., `mol.atoms.positions`) or via the {py:class}`get_feature <core.View.get_feature>` method for explicit retrieval."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "id": "9dbfbbb3",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "NodeFeature(value=array([[0. , 0. , 0. ],\n",
              "       [1.4, 0. , 0. ],\n",
              "       [1.4, 1.4, 0. ],\n",
              "       [0. , 1.4, 0. ],\n",
              "       [2.8, 0. , 0. ],\n",
              "       [4.2, 0. , 0. ],\n",
              "       [4.2, 1.4, 0. ],\n",
              "       [2.8, 1.4, 0. ],\n",
              "       [5.6, 0. , 0. ],\n",
              "       [7. , 0. , 0. ],\n",
              "       [7. , 1.4, 0. ],\n",
              "       [5.6, 1.4, 0. ]]), description=None)"
            ]
          },
          "execution_count": 27,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "mol.atoms.get_feature(\"positions\")\n",
        "# or\n",
        "mol.atoms.positions"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "id": "74afc8c7",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "NodeFeature(value=array(['ALA', 'GLY', 'ALA'], dtype='<U3'), description=None)"
            ]
          },
          "execution_count": 28,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "mol.residues.name"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "id": "012e30d4",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "NodeFeature(value=array(['A'], dtype='<U1'), description=None)"
            ]
          },
          "execution_count": 29,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "mol.chains.id"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "default",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.18"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}