QuickStart

Run kisekifs in a lightweight virtual machine.

1. Install Lima

https://github.com/lima-vm/lima

2. Start an Ubuntu virtual machine

  1. Create a virtual machine
limactl start \
  --name=kiseki \
  --cpus=2 \
  --memory=8 \
  template://ubuntu-lts
  2. Enter the virtual machine
limactl shell kiseki
  3. Install git and clone the repository
sudo apt-get update -y && sudo apt-get install -y git

git clone https://github.com/crrow/kisekifs && cd kisekifs
  4. Prepare the environment
# Set a proxy if needed:
# export HTTP_PROXY=http://192.168.5.2:7890 HTTPS_PROXY=http://192.168.5.2:7890
sudo bash ./hack/ubuntu/dep.sh

echo 'export PATH=$HOME/.cargo/bin/:$PATH' >> $HOME/.bashrc
  5. Build kisekifs
just build
  6. Run
# `just mount` mounts kisekifs at /tmp/kiseki
just minio && just mount

FileWriter in VFS

Logical Design

A file is divided into chunks, and each chunk is divided into slices. Chunks have a fixed size; slices vary in size but cannot exceed the chunk size. Each slice is a contiguous piece of data. A sequential write can extend the current slice, whereas a random write almost always requires a new slice. In this way, random writes are converted into sequential writes.
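The chunk and slice rules above can be sketched in a few lines of Rust. This is only an illustration, not the code kisekifs actually uses: `CHUNK_SIZE`, `Slice`, `Chunk`, and `locate` are hypothetical names, the 64 MiB chunk size is an assumption, and the caller is assumed to have already split any write that crosses a chunk boundary.

```rust
/// Assumed chunk size for illustration only; the real value may differ.
const CHUNK_SIZE: u64 = 64 << 20; // 64 MiB

/// A contiguous run of data inside one chunk.
#[derive(Debug, Clone, PartialEq)]
struct Slice {
    /// Offset of the slice's first byte within its chunk.
    chunk_offset: u64,
    /// Number of bytes written to the slice so far.
    len: u64,
}

/// The slices of one chunk, in creation order.
#[derive(Debug, Default)]
struct Chunk {
    slices: Vec<Slice>,
}

impl Chunk {
    /// Record a write of `len` bytes at `chunk_offset`.
    /// Returns the index of the slice that received the write.
    fn write(&mut self, chunk_offset: u64, len: u64) -> usize {
        assert!(chunk_offset + len <= CHUNK_SIZE, "a slice cannot exceed the chunk");
        // Sequential write: it starts exactly where the newest slice ends.
        let extends = matches!(
            self.slices.last(),
            Some(s) if s.chunk_offset + s.len == chunk_offset
        );
        if extends {
            self.slices.last_mut().unwrap().len += len;
        } else {
            // Random write: start a new slice instead of patching an old one.
            self.slices.push(Slice { chunk_offset, len });
        }
        self.slices.len() - 1
    }
}

/// Map a file offset to (chunk index, offset inside that chunk).
fn locate(file_offset: u64) -> (u64, u64) {
    (file_offset / CHUNK_SIZE, file_offset % CHUNK_SIZE)
}
```

Two back-to-back writes at offsets 0 and 4096 land in the same slice, while a later write at offset 100 opens a second slice that overlaps the first.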

In the current implementation, however, writes are still effectively random: with or without the write-back cache, we need to create a new file or object for each slice block, which produces a lot of small files.

JuiceFS's implementation employs a background check that flushes some files to remote storage to avoid running out of inodes in the local file system. The write-back cache still has room for optimization, such as merging multiple block files into one large segment file, as LevelDB does.
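As a rough sketch of the segment-file idea, the snippet below packs several blocks into one buffer and keeps an index of where each block landed. It is not part of kisekifs: `Segment` and its methods are hypothetical, and a `Vec<u8>` stands in for what would be a single on-disk file.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for one large segment file.
#[derive(Debug, Default)]
struct Segment {
    /// Concatenated block payloads; on disk this would be one file (one inode).
    data: Vec<u8>,
    /// Block id -> (offset, length) inside `data`.
    index: HashMap<u64, (usize, usize)>,
}

impl Segment {
    /// Append a block to the end of the segment and remember where it landed.
    fn append(&mut self, block_id: u64, block: &[u8]) {
        let offset = self.data.len();
        self.data.extend_from_slice(block);
        self.index.insert(block_id, (offset, block.len()));
    }

    /// Look up a block by id, if it was stored in this segment.
    fn get(&self, block_id: u64) -> Option<&[u8]> {
        let &(offset, len) = self.index.get(&block_id)?;
        Some(&self.data[offset..offset + len])
    }
}
```

Because every append goes to the end of the buffer, the cache sees sequential writes and consumes one inode per segment instead of one per block.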

Some Limitations

  1. Slices must be committed one by one, in order.
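One way to picture the ordering constraint: if slice uploads can finish out of order (an assumption here; the text above only states that commits are ordered), a small tracker can hold back finished slices until all of their predecessors are done. `OrderedCommitter` and its sequence numbers are hypothetical and only illustrate the idea.

```rust
use std::collections::BTreeSet;

/// Releases finished slices for commit strictly in creation order.
/// Sequence numbers are a hypothetical stand-in for slice identity.
#[derive(Debug, Default)]
struct OrderedCommitter {
    /// The next sequence number that is allowed to commit.
    next_seq: u64,
    /// Finished uploads that are still waiting for a predecessor.
    finished: BTreeSet<u64>,
}

impl OrderedCommitter {
    /// Mark `seq` as uploaded and return the sequence numbers
    /// that can now be committed, in order.
    fn finish(&mut self, seq: u64) -> Vec<u64> {
        self.finished.insert(seq);
        let mut ready = Vec::new();
        // Drain the contiguous run that starts at `next_seq`.
        while self.finished.remove(&self.next_seq) {
            ready.push(self.next_seq);
            self.next_seq += 1;
        }
        ready
    }
}
```

If slices 2 and 1 finish first, nothing is released; once slice 0 finishes, all three are released together in order. This is why one slow slice can delay every slice created after it.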

Bench

For reference only; this benchmark is not rigorous.

JuiceFS's bench

MetaEngine: local RocksDB; ObjectStorage: local MinIO storage; WriteBackCacheSize: 10G; WriteBufferSize: 2G (1G memory + 1G mmap)

Cleaning kernel cache, may ask for root privilege...

BlockSize: 1 MiB, BigFileSize: 1024 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 1

| ITEM | VALUE | COST |
| --- | --- | --- |
| Write big file | 2359.03 MiB/s | 0.43 s/file |
| Read big file | 1139.28 MiB/s | 0.90 s/file |
| Write small file | 1001.5 files/s | 1.00 ms/file |
| Read small file | 6276.5 files/s | 0.16 ms/file |
| Stat file | 94324.9 files/s | 0.01 ms/file |

Why Kiseki?

Kiseki means "miracle" in Japanese; I picked the name up from a Japanese song while browsing the internet.

I am just interested in Rust and passionate about distributed systems. Kiseki is my own exploration of building a file system in this language.

Will you continue to develop it?

I don't know yet. The basic storage part is done, developing it has been fun, and I have learned a lot from it, so I have basically achieved my goal. The remaining work is endless POSIX compliance, performance optimization, and edge-case handling. Since I'm not employed by any company at present, I have to focus on job hunting and other things. But who knows, we will see.

Contribution

Contributions are very welcome; just open a PR or an issue.

Why JuiceFS?

While JuiceFS offers a powerful tool for data management, porting it to Rust presents a unique opportunity to delve into the inner workings of file systems and gain a deeper understanding of their design and implementation.