diff --git a/blog/2025-01-08-welcome/2025-01-08-index.md b/blog/2025-01-08-welcome/2025-01-08-index.md
new file mode 100644
index 0000000..aee6862
--- /dev/null
+++ b/blog/2025-01-08-welcome/2025-01-08-index.md
@@ -0,0 +1,30 @@
+---
+slug: welcome-to-riverxdata
+title: Welcome to RiverXData
+authors: [river]
+tags: [cloud, data-analysis, slurm, hpc, web-platform]
+---
+
+## Unlock the Power of Cloud-Based Data Analysis with RiverXData
+
+
+Welcome to **RiverXData**, a data platform designed for scalable and efficient data analysis. Built on top of **SLURM**, RiverXData brings the power of high-performance computing (HPC) to a user-friendly web-based interface, enabling researchers, engineers, and data scientists to run complex computational tasks with ease.
+
+
+
+### Why RiverXData?
+
+Modern data analysis requires powerful computation, but traditional HPC environments can be challenging to access and manage. RiverXData simplifies this by providing:
+
+- **Seamless Web-Based Access**: Run jobs, manage workloads, and analyze results all from a web browser—no command-line expertise required.
+- **Scalability & Performance**: Harness the power of SLURM for efficient job scheduling and resource management.
+- **Cloud Flexibility**: Deploy, scale, and optimize computational tasks without worrying about infrastructure.
+- **User-Friendly Interface**: Launch SLURM jobs from a modern web UI and interact with them in real time.
+
+### Get Started Today
+
+Whether you're analyzing large datasets, running simulations, or training machine learning models, RiverXData empowers you to leverage the full potential of cloud-based HPC with minimal setup.
+
+Stay tuned for tutorials, feature updates, and real-world use cases as we continue to improve RiverXData.
+
+🔗 **Explore more at [RiverXData](#)**
diff --git a/blog/2025-01-08-welcome/riverxdata.svg b/blog/2025-01-08-welcome/riverxdata.svg
new file mode 100644
index 0000000..e6809ee
--- /dev/null
+++ b/blog/2025-01-08-welcome/riverxdata.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/blog/2025-03-14/how-to-build-slurm-single-node-with-full-functions.md b/blog/2025-03-14/how-to-build-slurm-single-node-with-full-functions.md
new file mode 100644
index 0000000..d2f7265
--- /dev/null
+++ b/blog/2025-03-14/how-to-build-slurm-single-node-with-full-functions.md
@@ -0,0 +1,356 @@
+---
+slug: single-node-slurm-setup
+title: Single Node Slurm Setup
+authors: [river]
+tags: [slurm, hpc]
+---
+
+
+# Single Node Slurm Setup
+
+
+SLURM (Simple Linux Utility for Resource Management) is a powerful open-source workload manager commonly used in high-performance computing (HPC) environments. While it is typically deployed across multi-node clusters, setting up SLURM on a single node can be highly beneficial for testing, development, or running lightweight workloads. This guide will help you understand the fundamental concepts of how a scheduler operates and provide a step-by-step walkthrough to configure SLURM with full functionality on a single-node setup.
+
+
+
+**How to Set Up a Fully Functional SLURM Cluster on a Single Node (with Proper Resource Constraints)**
+
+SLURM is a powerful job scheduler widely used in HPC environments, but configuring it properly—especially on a single-node cluster—can be tricky. Many tutorials cover the basics, like installing SLURM and running jobs, but they often overlook a critical aspect: cgroups (control groups).
+
+Without proper cgroup configuration, SLURM may fail to enforce CPU and memory constraints, allowing jobs to consume more resources than allocated. This can lead to performance degradation, system crashes, or unexpected behavior.
+
+In this guide, I’ll walk you through setting up a fully functional SLURM cluster on a single node, ensuring that CPU and memory limits are properly enforced using cgroups. Whether you're setting up a test environment or a lightweight HPC system, this tutorial will help you avoid common pitfalls and ensure that SLURM effectively manages resources as expected.
+
+Let’s dive in!
+
+:::info
+- This architecture is designed for a single node running Ubuntu 20.04.
+- It supports all standard Slurm features.
+- The setup is manual to help you understand how Slurm works. Because anyone logged into the node can still consume resources directly, without going through the scheduler, this configuration is not recommended for production environments.
+:::
+
+## **Install `slurmd` and `slurmctld`**
+
+Install the required software:
+
+```bash
+sudo apt-get update -y && sudo apt-get install -y slurmd slurmctld
+```
+
+Verify the installation:
+
+```bash
+# Locate slurmd and slurmctld
+which slurmd
+# Output: /usr/sbin/slurmd
+which slurmctld
+# Output: /usr/sbin/slurmctld
+```
+
+## **Prepare `slurm.conf`**
+
+:::info
+- This configuration applies to all nodes.
+:::
+
+Create the `slurm.conf` file:
+
+```bash
+cat <<EOF > slurm.conf
+# slurm.conf for a single-node Slurm cluster with accounting
+ClusterName=localcluster
+SlurmctldHost=localhost
+MpiDefault=none
+ProctrackType=proctrack/linuxproc
+ReturnToService=2
+SlurmctldPidFile=/run/slurmctld.pid
+SlurmctldPort=6817
+SlurmdPidFile=/run/slurmd.pid
+SlurmdPort=6818
+SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
+SlurmUser=slurm
+StateSaveLocation=/var/lib/slurm-llnl/slurmctld
+SwitchType=switch/none
+TaskPlugin=task/none
+
+# TIMERS
+InactiveLimit=0
+KillWait=30
+MinJobAge=300
+SlurmctldTimeout=120
+SlurmdTimeout=300
+Waittime=0
+
+# SCHEDULING
+SchedulerType=sched/backfill
+SelectType=select/cons_tres
+SelectTypeParameters=CR_Core
+
+# ACCOUNTING (slurmdbd, not enabled now)
+AccountingStorageType=accounting_storage/none
+JobAcctGatherType=jobacct_gather/none
+JobAcctGatherFrequency=30
+
+# LOGGING
+SlurmctldDebug=info
+SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
+SlurmdDebug=info
+SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
+
+# COMPUTE NODES (Single-node configuration)
+NodeName=localhost CPUs=2 Sockets=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1024 State=UNKNOWN
+
+# PARTITION CONFIGURATION
+PartitionName=LocalQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP
+EOF
+```
+
+Move the file to the correct location:
+
+```bash
+sudo mv slurm.conf /etc/slurm-llnl/slurm.conf
+```
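+
+The CPU, core, and memory values on the `NodeName` line should match the machine's real hardware. A quick sanity check (optional, not part of the original walkthrough) is to ask `slurmd` what it detects and compare it with `slurm.conf`:
+
+```bash
+# Print the node configuration (CPUs, cores, threads, memory) detected by slurmd.
+# Compare this output with the NodeName line in slurm.conf.
+slurmd -C
+```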
+
+## **Start Basic Slurm Services**
+
+Start the `slurmd` service:
+
+```bash
+# Start the service
+sudo service slurmd start
+# Check its status
+sudo service slurmd status
+```
+
+
+ 
+
+
+Start the `slurmctld` service:
+
+```bash
+# Start the service
+sudo service slurmctld start
+# Check its status
+sudo service slurmctld status
+```
+
+
+ 
+
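+With both daemons running, a quick check (not part of the original walkthrough) confirms that the node and partition are available:
+
+```bash
+# LocalQ should be listed and the node should be in the idle state.
+sinfo
+# Detailed state of the single compute node.
+scontrol show node localhost
+```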
+
+Submit a small job (adjust CPUs and memory as needed):
+
+```bash
+srun --mem 500MB -c 1 --pty bash
+# Check details of submitted jobs
+squeue -o "%i %P %u %T %M %l %D %C %m %R %Z %N" | column -t
+```
+
+Before submitting the job, memory usage is less than 200MB:
+
+
+ 
+
+
+Allocate 100MB of memory repeatedly:
+
+```bash
+declare -a mem
+i=0
+
+while :; do
+    # The original snippet was truncated here; the completion below is an
+    # assumption that reproduces the intent: keep appending ~100 MB chunks
+    # of data to shell variables until memory runs out.
+    mem[$i]=$(head -c 100M /dev/zero | base64)
+    ((i++))
+done
+```
+
+Without cgroup enforcement, this loop can keep growing well past the 500 MB requested for the job.
+
+## **Limit Resources Using cgroups**
+
+Create a `cgroup.conf` file to restrict resource usage:
+
+```bash
+cat <<EOF > cgroup.conf
+CgroupAutomount=yes
+CgroupMountpoint=/sys/fs/cgroup
+ConstrainCores=yes
+ConstrainRAMSpace=yes
+ConstrainDevices=yes
+ConstrainSwapSpace=yes
+MaxSwapPercent=5
+MemorySwappiness=0
+EOF
+```
+
+Move the file to the correct directory:
+
+```bash
+sudo mv cgroup.conf /etc/slurm-llnl/cgroup.conf
+```
+
+Update `slurm.conf` to enable cgroup plugins:
+
+```bash
+sudo sed -i -e "s|ProctrackType=proctrack/linuxproc|ProctrackType=proctrack/cgroup|" \
+ -e "s|TaskPlugin=task/none|TaskPlugin=task/cgroup|" /etc/slurm-llnl/slurm.conf
+```
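+
+You can confirm that the substitutions took effect (optional check):
+
+```bash
+# Both values should now point at the cgroup plugins.
+grep -E "^(ProctrackType|TaskPlugin)" /etc/slurm-llnl/slurm.conf
+```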
+
+Enable cgroup in GRUB and reboot:
+
+```bash
+sudo sed -i 's/^GRUB_CMDLINE_LINUX="/GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1 /' /etc/default/grub
+sudo update-grub
+sudo reboot
+```
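+
+After the reboot, you can confirm that the kernel picked up the new parameters (optional check):
+
+```bash
+# The kernel command line should now contain cgroup_enable=memory and swapaccount=1.
+cat /proc/cmdline
+```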
+
+Restart Slurm services:
+
+```bash
+sudo service slurmctld restart
+sudo service slurmd restart
+```
+
+Rerun the memory allocation job. This time, the job will be killed when it exceeds the memory limit:
+
+```bash
+srun --mem 500MB -c 1 --pty bash
+declare -a mem
+i=0
+
+while :; do
+    # Same truncated snippet as above, completed under the same assumption.
+    mem[$i]=$(head -c 100M /dev/zero | base64)
+    ((i++))
+done
+```
+
+## **Enable Slurm Accounting**
+
+Accounting allows monitoring of jobs, resource allocation, and permissions.
+
+
+ 
+
+
+### Install `slurmdbd`
+
+```bash
+sudo apt-get install slurmdbd mariadb-server -y
+```
+
+### Configure `slurmdbd.conf`
+
+:::info
+- Enables the accounting plugin to store account information.
+- Maps Linux users to Slurm accounts. Users cannot submit jobs without being added.
+- Useful for monitoring jobs and optimizing resource usage.
+:::
+
+Create the `slurmdbd.conf` file:
+
+```bash
+cat <<EOF > slurmdbd.conf
+PidFile=/run/slurmdbd.pid
+LogFile=/var/log/slurm/slurmdbd.log
+DebugLevel=error
+DbdHost=localhost
+DbdPort=6819
+
+# DB connection data
+StorageType=accounting_storage/mysql
+StorageHost=localhost
+StoragePort=3306
+StorageUser=slurm
+StoragePass=slurm
+StorageLoc=slurm_acct_db
+SlurmUser=slurm
+EOF
+```
+
+Move the file to the correct location:
+
+```bash
+sudo mv slurmdbd.conf /etc/slurm-llnl/slurmdbd.conf
+```
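+
+`slurmdbd` is strict about this file because it contains a database password: it is typically expected to be owned by the Slurm user and readable only by that user, otherwise the daemon may refuse to start. Tighten it like this:
+
+```bash
+# Restrict slurmdbd.conf to the slurm user (it stores the DB password).
+sudo chown slurm:slurm /etc/slurm-llnl/slurmdbd.conf
+sudo chmod 600 /etc/slurm-llnl/slurmdbd.conf
+```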
+
+### Create the Database
+
+Create the database and user:
+
+```bash
+sudo service mysql start
+sudo mysql -e "CREATE DATABASE slurm_acct_db;" && \
+sudo mysql -e "CREATE USER 'slurm'@'localhost' IDENTIFIED BY 'slurm';" && \
+sudo mysql -e "GRANT ALL PRIVILEGES ON slurm_acct_db.* TO 'slurm'@'localhost';" && \
+sudo mysql -e "FLUSH PRIVILEGES;"
+```
+
+Verify the database and user:
+
+```bash
+sudo mysql -e "SHOW DATABASES;"
+sudo mysql -e "SELECT User, Host FROM mysql.user;"
+sudo mysql -e "SHOW GRANTS FOR 'slurm'@'localhost';"
+```
+
+
+ 
+
+
+### Start `slurmdbd` Service
+
+```bash
+sudo service slurmdbd start
+```
+
+Update `slurm.conf` to enable accounting:
+
+```bash
+sudo sed -i -e "s|AccountingStorageType=accounting_storage/none|AccountingStorageType=accounting_storage/slurmdbd\nAccountingStorageEnforce=associations,limits,qos\nAccountingStorageHost=localhost\nAccountingStoragePort=6819|" /etc/slurm-llnl/slurm.conf
+sudo sed -i -e "s|JobAcctGatherType=jobacct_gather/none|JobAcctGatherType=jobacct_gather/cgroup|" /etc/slurm-llnl/slurm.conf
+sudo systemctl restart slurmctld slurmd
+```
+
+Add Linux users to Slurm accounting:
+
+```bash
+sudo sacctmgr -i add cluster localcluster
+sudo sacctmgr -i --quiet add account $USER Cluster=localcluster
+sudo sacctmgr -i --quiet add user $USER account=$USER DefaultAccount=$USER
+sudo systemctl restart slurmctld slurmd
+```
+
+
+ 
+
+
+### Submit a Job and View Metrics
+
+Submit a job:
+
+```bash
+srun --mem 500MB -c 1 --pty bash
+```
+
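+With accounting enabled, `sacct` reports past and running jobs along with their resource usage; the field list below is just one reasonable choice:
+
+```bash
+# Show recent jobs for the current user with allocated CPUs, requested memory, and state.
+sacct --format=JobID,JobName,Partition,AllocCPUS,ReqMem,State,Elapsed
+```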
+
+ 
+
+
+## **Conclusion**
+
+:::info
+- Slurm is widely used in academic and industrial settings for orchestrating distributed jobs across multiple nodes.
+- While Slurm is relatively easy to set up, critical steps like resource limits and accounting are often overlooked.
+- Slurm integrates seamlessly with distributed computing frameworks like Spark, Ray, Dask, and Flink, enabling efficient resource utilization for local development.
+:::
+
diff --git a/blog/2025-03-14/img/HPC_architecture.png b/blog/2025-03-14/img/HPC_architecture.png
new file mode 100644
index 0000000..9080cad
Binary files /dev/null and b/blog/2025-03-14/img/HPC_architecture.png differ
diff --git a/blog/2025-03-14/img/NFS.png b/blog/2025-03-14/img/NFS.png
new file mode 100644
index 0000000..e15bf4b
Binary files /dev/null and b/blog/2025-03-14/img/NFS.png differ
diff --git a/blog/2025-03-14/img/Slurm_logo.svg b/blog/2025-03-14/img/Slurm_logo.svg
new file mode 100644
index 0000000..3d1e7d2
--- /dev/null
+++ b/blog/2025-03-14/img/Slurm_logo.svg
@@ -0,0 +1,6 @@
+
+
+
+
+
+
diff --git a/blog/2025-03-14/img/account_usage.png b/blog/2025-03-14/img/account_usage.png
new file mode 100644
index 0000000..7e36973
Binary files /dev/null and b/blog/2025-03-14/img/account_usage.png differ
diff --git a/blog/2025-03-14/img/add_account.png b/blog/2025-03-14/img/add_account.png
new file mode 100644
index 0000000..f47b44e
Binary files /dev/null and b/blog/2025-03-14/img/add_account.png differ
diff --git a/blog/2025-03-14/img/add_db.png b/blog/2025-03-14/img/add_db.png
new file mode 100644
index 0000000..8912423
Binary files /dev/null and b/blog/2025-03-14/img/add_db.png differ
diff --git a/blog/2025-03-14/img/config_1.png b/blog/2025-03-14/img/config_1.png
new file mode 100644
index 0000000..2459339
Binary files /dev/null and b/blog/2025-03-14/img/config_1.png differ
diff --git a/blog/2025-03-14/img/config_2.png b/blog/2025-03-14/img/config_2.png
new file mode 100644
index 0000000..1c046bb
Binary files /dev/null and b/blog/2025-03-14/img/config_2.png differ
diff --git a/blog/2025-03-14/img/config_3.png b/blog/2025-03-14/img/config_3.png
new file mode 100644
index 0000000..b6d8e82
Binary files /dev/null and b/blog/2025-03-14/img/config_3.png differ
diff --git a/blog/2025-03-14/img/create_app.png b/blog/2025-03-14/img/create_app.png
new file mode 100644
index 0000000..fd57cdc
Binary files /dev/null and b/blog/2025-03-14/img/create_app.png differ
diff --git a/blog/2025-03-14/img/grafana.png b/blog/2025-03-14/img/grafana.png
new file mode 100644
index 0000000..9b49679
Binary files /dev/null and b/blog/2025-03-14/img/grafana.png differ
diff --git a/blog/2025-03-14/img/grafana_down.png b/blog/2025-03-14/img/grafana_down.png
new file mode 100644
index 0000000..2eab33d
Binary files /dev/null and b/blog/2025-03-14/img/grafana_down.png differ
diff --git a/blog/2025-03-14/img/job_sinfo.png b/blog/2025-03-14/img/job_sinfo.png
new file mode 100644
index 0000000..59ce33d
Binary files /dev/null and b/blog/2025-03-14/img/job_sinfo.png differ
diff --git a/blog/2025-03-14/img/memory_before_stress.png b/blog/2025-03-14/img/memory_before_stress.png
new file mode 100644
index 0000000..7c8405b
Binary files /dev/null and b/blog/2025-03-14/img/memory_before_stress.png differ
diff --git a/blog/2025-03-14/img/node_grafana.png b/blog/2025-03-14/img/node_grafana.png
new file mode 100644
index 0000000..fa45d7c
Binary files /dev/null and b/blog/2025-03-14/img/node_grafana.png differ
diff --git a/blog/2025-03-14/img/oom.png b/blog/2025-03-14/img/oom.png
new file mode 100644
index 0000000..d5f09a2
Binary files /dev/null and b/blog/2025-03-14/img/oom.png differ
diff --git a/blog/2025-03-14/img/overresource_limit.png b/blog/2025-03-14/img/overresource_limit.png
new file mode 100644
index 0000000..26cc451
Binary files /dev/null and b/blog/2025-03-14/img/overresource_limit.png differ
diff --git a/blog/2025-03-14/img/resume_node.png b/blog/2025-03-14/img/resume_node.png
new file mode 100644
index 0000000..c3db903
Binary files /dev/null and b/blog/2025-03-14/img/resume_node.png differ
diff --git a/blog/2025-03-14/img/sacct_disable.png b/blog/2025-03-14/img/sacct_disable.png
new file mode 100644
index 0000000..ea40e0c
Binary files /dev/null and b/blog/2025-03-14/img/sacct_disable.png differ
diff --git a/blog/2025-03-14/img/slurm.svg b/blog/2025-03-14/img/slurm.svg
new file mode 100644
index 0000000..efc89ff
--- /dev/null
+++ b/blog/2025-03-14/img/slurm.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/blog/2025-03-14/img/slurm_arch.gif b/blog/2025-03-14/img/slurm_arch.gif
new file mode 100644
index 0000000..7f5d7b7
Binary files /dev/null and b/blog/2025-03-14/img/slurm_arch.gif differ
diff --git a/blog/2025-03-14/img/slurm_grafana.png b/blog/2025-03-14/img/slurm_grafana.png
new file mode 100644
index 0000000..88678fd
Binary files /dev/null and b/blog/2025-03-14/img/slurm_grafana.png differ
diff --git a/blog/2025-03-14/img/slurmctld_status.png b/blog/2025-03-14/img/slurmctld_status.png
new file mode 100644
index 0000000..910d377
Binary files /dev/null and b/blog/2025-03-14/img/slurmctld_status.png differ
diff --git a/blog/2025-03-14/img/slurmd_status.png b/blog/2025-03-14/img/slurmd_status.png
new file mode 100644
index 0000000..111b02c
Binary files /dev/null and b/blog/2025-03-14/img/slurmd_status.png differ
diff --git a/blog/2025-03-14/img/small_HPC.jpg b/blog/2025-03-14/img/small_HPC.jpg
new file mode 100644
index 0000000..5f113d0
Binary files /dev/null and b/blog/2025-03-14/img/small_HPC.jpg differ
diff --git a/blog/2025-03-14/img/srun_fastqc.png b/blog/2025-03-14/img/srun_fastqc.png
new file mode 100644
index 0000000..e8ca230
Binary files /dev/null and b/blog/2025-03-14/img/srun_fastqc.png differ
diff --git a/blog/2025-03-14/img/submit_job.png b/blog/2025-03-14/img/submit_job.png
new file mode 100644
index 0000000..eb1d0bb
Binary files /dev/null and b/blog/2025-03-14/img/submit_job.png differ
diff --git a/blog/2025-03-28/1k_threads.png b/blog/2025-03-28/1k_threads.png
new file mode 100644
index 0000000..3c58e5f
Binary files /dev/null and b/blog/2025-03-28/1k_threads.png differ
diff --git a/blog/2025-03-28/cpu_thread.png b/blog/2025-03-28/cpu_thread.png
new file mode 100644
index 0000000..b4ffed1
Binary files /dev/null and b/blog/2025-03-28/cpu_thread.png differ
diff --git a/blog/2025-03-28/process_by_htop.png b/blog/2025-03-28/process_by_htop.png
new file mode 100644
index 0000000..e91c306
Binary files /dev/null and b/blog/2025-03-28/process_by_htop.png differ
diff --git a/blog/2025-03-28/simple.png b/blog/2025-03-28/simple.png
new file mode 100644
index 0000000..4b54b9f
Binary files /dev/null and b/blog/2025-03-28/simple.png differ
diff --git a/blog/2025-03-28/stop-using-python-thread.md b/blog/2025-03-28/stop-using-python-thread.md
new file mode 100644
index 0000000..58d9d82
--- /dev/null
+++ b/blog/2025-03-28/stop-using-python-thread.md
@@ -0,0 +1,211 @@
+---
+slug: python-thread
+title: "Python thread: deep dive"
+authors: [river]
+tags: [python]
+---
+
+Modern computers are designed to handle multitasking, enabling you to run multiple programs simultaneously. But have you ever wondered how computers manage this complexity?
+
+In programming, Python is one of the most popular languages, and it supports multitasking through multiple processes and threads. However, Python has a unique feature that might lead to inefficient usage if not understood properly. Let’s dive in and explore.
+
+
+**Figure 1:** A system monitor displaying process IDs (PIDs), user ownership, and resource consumption (e.g., memory and CPU).
+
+
+
+## What Is a Process in Computing?
+
+A process is an instance of a program being executed by a computer. It includes the program's code, data, and allocated system resources. Each process operates **independently** in its own memory space and can spawn multiple threads for concurrent execution.
+
+The operating system is responsible for managing processes, enabling them to communicate, **schedule CPU time**, and **share resources** efficiently.
+
+
+## Threads in Python
+In Python, a thread runs within a single process and is managed by that process. A single process can run multiple threads, and these threads share computing
+resources such as CPU and memory. However, due to Python’s Global Interpreter Lock (GIL), threads are restricted to executing on a single CPU core at a time,
+even if multiple CPU cores are available.
+:::info
+What is the GIL, and why does Python need it?
++ The Global Interpreter Lock (GIL) is a mechanism in CPython that ensures only one thread executes Python bytecode at a time, even on multi-core processors.
++ It prevents race conditions in memory management but limits true parallel execution for CPU-bound tasks. To achieve real parallelism, multiprocessing should be used instead of threading.
+:::
+
+## What Is the GIL and How Do Python Threads Work?
+:::warning
+This proof of concept applies to Python versions below 3.13. From Python 3.13 onward, the GIL can optionally be disabled, allowing even higher performance for CPU-bound tasks while threads still share the same resources (memory, etc.).
+:::
+
+The **Global Interpreter Lock (GIL)** is a mechanism in Python that manages thread execution by allowing only **one thread to run at a time**, even on multi-core processors.
+Although multiple threads can be created, they do not achieve true parallelism for CPU-bound tasks due to the GIL.
+
+However, Python threads can still improve performance for **I/O-bound tasks** like downloading or uploading files because the CPU spends most of its time waiting for external resources.
+The Python interpreter quickly **switches between threads**, saving and restoring their states, making it appear as if multiple threads are running simultaneously.
+In reality, the **CPU rapidly cycles through threads**, ensuring that tasks that do not require significant computation feel like they are running in parallel.
+
+
+## Simple process with threads in Python
+### Simple program
++ Start with a simple function and create the two files below, beginning with `single_worker.py`. Then duplicate the `worker` step; this mirrors a real-world problem where you have several I/O tasks, such as downloading multiple files.
++ To keep things simple, I use `sleep` as a stand-in for the I/O wait:
+```bash
+download_file()
+download_file()
+```
+
+**single_worker.py**
+```python
+import time
+
+def worker(task_id, duration):
+    """Simulate work by sleeping for `duration` seconds."""
+    print(f"Thread-{task_id} started")
+    time.sleep(duration)
+    print(f"Thread-{task_id} finished")
+
+# execute
+start_time = time.time()
+worker(1,1)
+end_time = time.time()
+print(f"Total execution time: {end_time - start_time:.2f} seconds")
+```
+
+**sequential_worker.py**
+```python
+import time
+
+def worker(task_id, duration):
+    """Simulate work by sleeping for `duration` seconds."""
+    print(f"Thread-{task_id} started")
+    time.sleep(duration)
+    print(f"Thread-{task_id} finished")
+
+# execute
+start_time = time.time()
+worker(1, 1)
+# duplicate the call, as if downloading a second file
+worker(2, 1)
+end_time = time.time()
+print(f"Total execution time: {end_time - start_time:.2f} seconds")
+```
+
+Run both and compare their execution times:
+```bash
+python3 single_worker.py
+python3 sequential_worker.py
+```
+
+
+
+**Figure 2:** Execute the simple programs
+
+:::info
++ For functions like these, running them sequentially is bad practice. Your first instinct may be to wrap them in a `loop`, which reduces the number of lines in the script,
+but it still wastes time when you can do **much better**.
+:::
+
+### Multiple threads
++ Create a file with the content below and name it `multiple_thread.py`:
+```python
+import sys
+import threading
+import time
+
+def worker(task_id, duration):
+ """Simulate work by sleeping for `duration` seconds."""
+ print(f"Thread-{task_id} started")
+ time.sleep(duration)
+ print(f"Thread-{task_id} finished")
+
+def run_threads(num_threads, duration):
+ """Create and start multiple threads."""
+ threads = []
+
+ start_time = time.time()
+
+ for i in range(num_threads):
+ thread = threading.Thread(target=worker, args=(i, duration))
+ threads.append(thread)
+ thread.start()
+
+ for thread in threads:
+ thread.join() # Wait for all threads to finish
+
+ end_time = time.time()
+ print(f"Total execution time: {end_time - start_time:.2f} seconds")
+
+if __name__ == "__main__":
+    run_threads(int(sys.argv[1]), 1)
+```
++ Run it, passing the number of threads as an argument:
+```bash
+# 10 here is just an example; pass however many threads you want
+python3 multiple_thread.py 10
+```
+
+
+**Figure 3:** Multiple threads run concurrently, so there is no need to wait for each one sequentially
+
+
+### Number of CPUs vs. Number of Threads
++ It’s useful to see how multiple threads can improve execution time, but the performance gain does not depend on the number of CPUs.
+I ran the program with 1000 threads, and the results showed that for non-CPU-bound tasks, using many threads is fine—but excessive threads should be used with caution.
+
++ Even with more threads, they still need to share resources like internet bandwidth, and switching between threads adds overhead.
+As a result, the execution time was 1.23s instead of 1s, showing that too many threads can slow things down.
+
++ In general, for CPU-bound tasks, the number of threads should match the number of CPUs.
+However, for I/O-bound tasks like network requests, you can experiment with more threads to maximize efficiency.
+Instead of waiting 1000 seconds, the task completed in 1.23s—a significant speedup! 🚀
+
+
+
+**Figure 4:** 1,000 threads completing in 1.23 s
+
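+For I/O-bound work like this, the standard library's `concurrent.futures.ThreadPoolExecutor` is a more idiomatic way to manage a pool of threads. A minimal sketch (not part of the original scripts; the worker just sleeps to stand in for an I/O task):
+
+```python
+import time
+from concurrent.futures import ThreadPoolExecutor
+
+def worker(task_id, duration=1):
+    """Simulate an I/O-bound task by sleeping."""
+    time.sleep(duration)
+    return task_id
+
+start = time.time()
+# Run 100 sleeping tasks on a pool of 100 threads.
+with ThreadPoolExecutor(max_workers=100) as pool:
+    results = list(pool.map(worker, range(100)))
+print(f"Finished {len(results)} tasks in {time.time() - start:.2f} seconds")
+```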
+
+### What Happens When You Use Threads for CPU-Bound Tasks?
+Threads do not reduce execution time here, because each computation still needs a CPU core and the GIL lets only one thread run Python code at a time. In short, threads should not be used for CPU-bound tasks.
+Create a file with the content below and name it `cpu_thread.py`.
+:::warning
+Avoid using threads for CPU-bound tasks, as they often increase execution time without providing linear performance improvements.
+:::
+```python
+import threading
+import time
+import sys
+
+def cpu_task(n):
+ """Simulate a CPU-bound task by performing heavy computations."""
+ print(f"Thread-{n} started")
+ count = 0
+ for _ in range(10**7): # Simulate CPU-intensive work
+ count += 1
+ print(f"Thread-{n} finished")
+
+def run_threads(num_threads):
+ """Create and start multiple CPU-bound threads."""
+ threads = []
+
+ start_time = time.time()
+
+ for i in range(num_threads):
+ thread = threading.Thread(target=cpu_task, args=(i,))
+ threads.append(thread)
+ thread.start()
+
+ for thread in threads:
+ thread.join() # Wait for all threads to finish
+
+ end_time = time.time()
+ print(f"Total execution time: {end_time - start_time:.2f} seconds")
+
+if __name__ == "__main__":
+ run_threads(int(sys.argv[1]))
+```
+
+
+
+**Figure 5:** Using threads for CPU-bound tasks in Python only adds overhead
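+
+If the work really is CPU-bound, the `multiprocessing` module mentioned in the GIL note above avoids the problem by running workers in separate processes, each with its own interpreter and GIL. A hedged sketch (not part of the original post) of the same counting task with processes:
+
+```python
+import multiprocessing as mp
+import sys
+import time
+
+def cpu_task(n):
+    """Same CPU-bound loop as above, now run in its own process."""
+    count = 0
+    for _ in range(10**7):
+        count += 1
+    return n
+
+if __name__ == "__main__":
+    num_procs = int(sys.argv[1]) if len(sys.argv) > 1 else 4
+    start = time.time()
+    with mp.Pool(processes=num_procs) as pool:
+        pool.map(cpu_task, range(num_procs))
+    print(f"Total execution time: {time.time() - start:.2f} seconds")
+```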
\ No newline at end of file
diff --git a/blog/2025-03-28/thread.png b/blog/2025-03-28/thread.png
new file mode 100644
index 0000000..b352881
Binary files /dev/null and b/blog/2025-03-28/thread.png differ
diff --git a/blog/2025-03-29/galaxy_local.png b/blog/2025-03-29/galaxy_local.png
new file mode 100644
index 0000000..9500c55
Binary files /dev/null and b/blog/2025-03-29/galaxy_local.png differ
diff --git a/blog/2025-03-29/login_and_run_web.png b/blog/2025-03-29/login_and_run_web.png
new file mode 100644
index 0000000..42e9e79
Binary files /dev/null and b/blog/2025-03-29/login_and_run_web.png differ
diff --git a/blog/2025-03-29/remote_server.webp b/blog/2025-03-29/remote_server.webp
new file mode 100644
index 0000000..dd00ca0
Binary files /dev/null and b/blog/2025-03-29/remote_server.webp differ
diff --git a/blog/2025-03-29/ssh-remote-tunnelling.md b/blog/2025-03-29/ssh-remote-tunnelling.md
new file mode 100644
index 0000000..9d93ad3
--- /dev/null
+++ b/blog/2025-03-29/ssh-remote-tunnelling.md
@@ -0,0 +1,58 @@
+---
+slug: ssh-remote-tunnel
+title: "SSH remote tunnel"
+authors: [river]
+tags: [ hpc, ssh, network, slurm ]
+---
+
+When working on a remote High-Performance Computing (HPC) cluster or a cloud server, accessing development tools locally can be challenging.
+One effective approach is to use an SSH tunnel to securely access a Galaxy server (a web platform for bioinformatics) as if it were running on your local machine.
+
+
+
+
+
+:::info
++ Replace the host and port with your own SSH details.
++ Run any web service on a specific port, then forward it over SSH to access it from your local machine.
++ [**Docker**](https://docs.docker.com/engine/install/) must be installed. For more details, see [**galaxy docker**](https://github.com/bgruening/docker-galaxy).
+:::
+## Why Use SSH Tunneling?
+SSH tunneling allows you to securely forward ports from a remote server to your local system. This is useful for accessing services that are running on the remote machine without exposing them to the internet.
+
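+In its general form, local port forwarding maps a port on your machine to a port that is reachable from the remote host (the names below are placeholders, not real hosts):
+
+```bash
+# ssh -N -L <local_port>:<target_host>:<target_port> <user>@<remote_server>
+# Everything sent to localhost:<local_port> travels through the SSH connection
+# and reaches <target_host>:<target_port> as seen from the remote server.
+ssh -N -L 8080:localhost:8080 user@remote.example.com
+```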
+
+
+
+## Login to your remote server
+```bash
+ssh river@platform.riverxdata.com
+```
+
+## Run a web service
+Start a Galaxy server (a web platform for bioinformatics); it will be accessible on the remote machine at `localhost:8080`:
+```bash
+docker run -d -p 8080:80 \
+ -v ./galaxy_storage/:/export/ \
+ quay.io/bgruening/galaxy
+```
+
+The web service is now available on the remote system at port 8080; Docker maps that port to port 80 inside the container, where Galaxy is listening.
+
+## Test the web service
+Check that the service responds on port 8080:
+```bash
+curl localhost:8080
+```
+
+
+**Figure 1:** Log in and start the Galaxy server using Docker on the remote machine
+
+Now you can access Galaxy from your local computer via SSH. The local machine reaches the service on port 8081, which is forwarded from port 8080 on `platform.riverxdata.com`:
+```bash
+ssh -N -L 8081:localhost:8080 platform.riverxdata.com
+```
+
+Open your web browser at `http://localhost:8081` to see it working.
+
+**Figure 2:** Access your web service from the local machine
\ No newline at end of file
diff --git a/blog/2025-03-30/micromamba.png b/blog/2025-03-30/micromamba.png
new file mode 100644
index 0000000..071e6bc
Binary files /dev/null and b/blog/2025-03-30/micromamba.png differ
diff --git a/blog/2025-03-30/setup-remote-server.md b/blog/2025-03-30/setup-remote-server.md
new file mode 100644
index 0000000..418aae1
--- /dev/null
+++ b/blog/2025-03-30/setup-remote-server.md
@@ -0,0 +1,180 @@
+---
+slug: setup-shell-remote-server
+title: "Hack your shell environment"
+authors: [river]
+tags: [ zsh, river, micromamba ]
+---
+
+
+Setting up a shell environment on a remote server can be a tedious process, especially when dealing with multiple dependencies. The same setup also works on your local machine.
+This guide walks you through installing **micromamba**, setting up a River environment, and configuring useful tools such as **goofys** for cloud storage (S3) and **zsh** for a better shell experience.
+
+
+
+## Set up variables
+Pin the versions of goofys and micromamba to install; adjust them if you want the latest releases.
+```bash
+echo "River software dependencies setup"
+MICROMAMBA_VERSION=2.0.5
+GOOFYS_VERSION=0.24.0
+RIVER_BIN=$HOME/.river/bin
+mkdir -p $RIVER_BIN
+```
+
+## Install micromamba
+Micromamba is a lightweight, fast alternative to Conda for managing environments and packages.
+It is a small, standalone executable that provides the same package management features as Conda but with much lower overhead. Unlike Conda, Micromamba does not require a full Anaconda installation, making it ideal for minimal setups, CI/CD pipelines, and remote servers.
+
+:::info
+Key Features:
+
++ Fast and lightweight: Much smaller than Conda, with a quick installation process.
++ Standalone executable: No need for a full Conda installation.
++ Supports Conda environments: Fully compatible with Conda packages and repositories.
++ Works in remote/cloud environments: Ideal for automation and scripting
+:::
+
+Create an environment called `river` that installs common software in user space, without requiring `sudo` permissions:
++ Python (from Anaconda) for scripting and development.
++ R-base for statistical computing and bioinformatics applications.
++ Singularity (v3.8.6) for containerized workflows. A Singularity image is a single file, so it is portable and integrates well with the host system.
++ Nextflow to enable scalable and reproducible scientific workflows.
++ Zsh for a better shell experience.
+
++ AWS CLI to interact with AWS services like S3.
+```bash
+# Base softwares
+# micromamba
+export HOME=$HOME
+export MICROMAMBA_EXECUTE=$HOME/.river/bin
+export PATH=$HOME/.river/bin:$PATH
+mkdir -p $MICROMAMBA_EXECUTE
+if [ ! -f "$RIVER_BIN/micromamba" ]; then
+ echo "Installing micromamba..."
+ curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/$MICROMAMBA_VERSION | tar -xvj bin/micromamba
+ mv bin/micromamba $RIVER_BIN/micromamba
+ rm -rf bin
+else
+ echo "micromamba already exists at: $HOME/.river/bin"
+fi
+# Start up
+export PATH="$HOME/.river/bin:$PATH"
+export MAMBA_ROOT_PREFIX="$HOME/.river/images/micromamba"
+# Create the singularity dir
+mkdir -p $HOME/.river/images/singularities/images
+
+# Create micromamba environment and install river-utils
+micromamba create -n river \
+ anaconda::python \
+ conda-forge::r-base \
+ conda-forge::singularity=3.8.6 \
+ bioconda::nextflow \
+ conda-forge::zsh \
+ conda-forge::awscli \
+ -y
+```
+
+### Activate environment
+To install additional software, you can install it in the `river` environment or create a new environment. Below is an example of how to activate the `river` environment and install additional software.
+Micromamba is conda-compatible, so just search for your software (most packages are hosted on [**Anaconda.org**](https://anaconda.org/)), find the install command, and replace `conda` with `micromamba`.
+
+```bash
+# Activate the river environment
+eval "$(micromamba shell hook --shell bash)"
+micromamba activate river
+
+# Install additional software
+micromamba install -n river \
+ conda-forge::htop \
+ conda-forge::jq \
+ conda-forge::tree \
+ -y
+```
+### Create new environment
+```bash
+# Create a new environment and install Python
+micromamba create -n new_env \
+ anaconda::python=3.9 \
+ -y
+
+# Activate the new environment
+micromamba activate new_env
+
+# Verify Python installation
+python --version
+```
+
+
+**Figure 1:** Micromamba helps you set up a complete working environment on a remote server
+
+In this example:
+- `new_env`: The name of the new environment.
+- `python=3.9`: Specifies the version of Python to install. You can adjust the version as needed.
+
+This allows you to create isolated environments for different projects or dependencies. It is good practice to keep tools in separate environments:
+if you develop web applications, for example, install `npm`, `python`, `java`, etc. in different environments.
+
+## Goofys fuse file system for S3
+See the posts tagged goofys for more information. In short, goofys makes S3-compatible cloud storage behave like a local file system with nearly full POSIX support.
+```bash
+# goofys
+if [ ! -f "$RIVER_BIN/goofys" ]; then
+ echo "Installing goofys..."
+ curl -L https://github.com/kahing/goofys/releases/download/v${GOOFYS_VERSION}/goofys -o $RIVER_BIN/goofys
+ chmod +x $RIVER_BIN/goofys
+else
+ echo "Goofys already exists at: $HOME/.river/bin"
+ goofys --help 2> /dev/null
+fi
+```
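+
+Once the binary is in place, mounting a bucket looks roughly like this (the bucket name, region, and mount point are placeholders; credentials are read from the usual AWS locations such as `~/.aws/credentials` or environment variables):
+
+```bash
+# Mount an S3 bucket so it behaves like a local directory (hypothetical names).
+mkdir -p $HOME/s3-data
+goofys --region us-east-1 my-example-bucket $HOME/s3-data
+ls $HOME/s3-data
+```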
+
+## Improve your shell experience
+Install zsh and related extensions. These are the standard ones; you can find more at [**Oh My Zsh**](https://ohmyz.sh/).
+
+```bash
+eval "$(micromamba shell hook --shell bash)"
+micromamba activate river
+sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" "" --unattended
+
+# plugins
+git clone https://github.com/zsh-users/zsh-autosuggestions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions
+git clone https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting
+
+# Update .zshrc
+echo "Updating .zshrc..."
+sed -i "s|plugins=(git)|plugins=(\n git\n docker\n docker-compose\n history\n rsync\n safe-paste\n zsh-autosuggestions\n zsh-syntax-highlighting\n)\n|" ~/.zshrc
+source ~/.zshrc
+```
+
+Now you can see how it improves your experience:
++ Your syntax is highlighted, so you can see whether a command is correct as you type
++ Your command history can be quickly reused
+
+
+
+
+**Figure 2:** zsh improves your shell experience
+
+## Activate all by `.river.sh`
+Create a file called `.river.sh` so you can activate the whole standard setup in one step:
+
+```bash
+# Create .river.sh for environment variables
+cat <<EOF > $HOME/.river.sh
+export HOME=${HOME}
+export HOME_TOOLS=\${HOME}/.river/bin
+export MAMBA_ROOT_PREFIX=\${HOME}/.river/images/micromamba
+export SINGULARITY_CACHE_DIR=\${HOME}/.river/images/singularities
+export NXF_SINGULARITY_CACHEDIR=\$SINGULARITY_CACHE_DIR/images
+export PATH=\${HOME_TOOLS}:\$PATH
+eval "\$(micromamba shell hook -s posix)"
+micromamba activate river
+zsh
+source ~/.zshrc
+EOF
+```
+
+To activate everything automatically when you log in, source it from your `~/.bashrc`:
+```bash
+source ~/.river.sh
+```
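+
+For example, append it once:
+
+```bash
+echo 'source ~/.river.sh' >> ~/.bashrc
+```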
diff --git a/blog/2025-03-30/zsh.png b/blog/2025-03-30/zsh.png
new file mode 100644
index 0000000..06eb4b6
Binary files /dev/null and b/blog/2025-03-30/zsh.png differ
diff --git a/blog/authors.yml b/blog/authors.yml
new file mode 100644
index 0000000..d723fbb
--- /dev/null
+++ b/blog/authors.yml
@@ -0,0 +1,10 @@
+river:
+ name: Thanh-Giang (River) Tan Nguyen
+ title: Software and bioinformatics engineer
+ url: https://www.facebook.com/nttg8100
+ image_url: https://avatars.githubusercontent.com/u/64969412?v=4
+ page: true
+ email: nttg8100@gmail.com
+ socials:
+ linkedin: https://www.linkedin.com/in/thanh-giang-tan-nguyen-761b28190/
+ github: nttg8100
diff --git a/blog/tags.yml b/blog/tags.yml
new file mode 100644
index 0000000..45f6f11
--- /dev/null
+++ b/blog/tags.yml
@@ -0,0 +1,19 @@
+facebook:
+ label: Facebook
+ permalink: /facebook
+ description: Facebook tag description
+
+hello:
+ label: Hello
+ permalink: /hello
+ description: Hello tag description
+
+docusaurus:
+ label: Docusaurus
+ permalink: /docusaurus
+ description: Docusaurus tag description
+
+hola:
+ label: Hola
+ permalink: /hola
+ description: Hola tag description
diff --git a/docusaurus.config.ts b/docusaurus.config.ts
index f4680e2..02bd6a4 100644
--- a/docusaurus.config.ts
+++ b/docusaurus.config.ts
@@ -38,9 +38,6 @@ const config: Config = {
{
docs: {
sidebarPath: "./sidebars.ts",
- // Please change this to your repo.
- // Remove this to remove the "edit this page" links.
- editUrl: "https://github.com/riverxdata",
},
blog: {
showReadingTime: true,
@@ -50,8 +47,6 @@ const config: Config = {
},
// Please change this to your repo.
// Remove this to remove the "edit this page" links.
- editUrl: "https://github.com/riverxdata",
- // Useful options to enforce blogging best practices
onInlineTags: "warn",
onInlineAuthors: "warn",
onUntruncatedBlogPosts: "warn",
@@ -93,6 +88,7 @@ const config: Config = {
position: "left",
label: "Documentation",
},
+ { to: "/blog", label: "Blog", position: "left" },
{
href: "https://github.com/riverxdata/river",
label: "GitHub",