QuickStart

Run kisekifs in a lightweight virtual machine.

2. Start an Ubuntu virtual machine

Create a virtual machine:

limactl start \
  --name=kiseki \
  --cpus=2 \
  --memory=8 \
  template://ubuntu-lts

Open a shell inside the virtual machine:

limactl shell kiseki

Install git and clone the repository

sudo apt-get update -y && sudo apt-get install -y git

git clone https://github.com/crrow/kisekifs && cd kisekifs

Prepare the environment

# Optional, set a proxy first: export HTTP_PROXY=http://192.168.5.2:7890 HTTPS_PROXY=http://192.168.5.2:7890
sudo bash ./hack/ubuntu/dep.sh

echo 'export PATH=$HOME/.cargo/bin/:$PATH' >> $HOME/.bashrc

Build kisekifs

just build

# `just mount` mounts kisekifs at /tmp/kiseki
just minio && just mount

FileWriter in VFS

A file is divided into chunks, and each chunk is divided into slices. Every chunk has a fixed size; a slice has no fixed size, but it cannot exceed the chunk size. Each slice is a contiguous piece of data. A sequential write can extend the current slice, while a random write almost always requires creating a new slice. ~~In this way, we convert the random write to sequential write.~~
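The chunk and slice layout described above can be sketched roughly as follows. This is a minimal illustration, not the kisekifs implementation: `CHUNK_SIZE` (64 MiB here), `ChunkWrite`, `Slice`, `Chunk`, and `split_write` are all hypothetical names and values chosen for the example, and the slice logic only ever tries to extend the most recent slice.

```rust
/// Assumed chunk size for illustration only (64 MiB); the real value may differ.
pub const CHUNK_SIZE: u64 = 64 * 1024 * 1024;

/// The part of a file write that lands inside a single chunk (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChunkWrite {
    /// Index of the chunk within the file.
    pub chunk_index: u64,
    /// Offset of the write relative to the start of that chunk.
    pub chunk_offset: u64,
    /// Number of bytes that fall into that chunk.
    pub len: u64,
}

/// Split a write of `len` bytes at file offset `offset` into per-chunk pieces.
pub fn split_write(offset: u64, len: u64) -> Vec<ChunkWrite> {
    let mut pieces = Vec::new();
    let mut pos = offset;
    let end = offset + len;
    while pos < end {
        let chunk_index = pos / CHUNK_SIZE;
        let chunk_offset = pos % CHUNK_SIZE;
        // Take whatever fits before the chunk boundary or the end of the write.
        let n = (CHUNK_SIZE - chunk_offset).min(end - pos);
        pieces.push(ChunkWrite { chunk_index, chunk_offset, len: n });
        pos += n;
    }
    pieces
}

/// A contiguous run of written bytes inside one chunk (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Slice {
    pub offset: u64,
    pub len: u64,
}

/// A chunk that only tracks slice boundaries, not the data itself.
#[derive(Debug, Default)]
pub struct Chunk {
    pub slices: Vec<Slice>,
}

impl Chunk {
    /// Record a write inside this chunk.
    /// A write that starts exactly where the last slice ends extends that slice;
    /// any other write is treated as random and opens a new slice.
    pub fn write(&mut self, chunk_offset: u64, len: u64) {
        assert!(chunk_offset + len <= CHUNK_SIZE, "write crosses the chunk boundary");
        if let Some(last) = self.slices.last_mut() {
            if last.offset + last.len == chunk_offset {
                last.len += len;
                return;
            }
        }
        self.slices.push(Slice { offset: chunk_offset, len });
    }
}
```

Under these assumptions, a write that straddles a chunk boundary is first split with `split_write`, and each piece is then handed to the matching `Chunk`, so a slice can never grow past its chunk.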

Actually, it is still random write in the current implementation: with or without the write-back cache, we need to create a new file or object for each slice block, which introduces a lot of small files.

JuiceFS's implementation employs a background check that flushes some files to remote storage, to avoid running out of inodes on the local file system. The write-back cache still has room for optimization, such as merging multiple block files into one big segment file, as LevelDB does.

Some Limitations

We need to commit slices one at a time, in order.
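One way to picture this limitation is a queue that lets slices finish uploading in any order but only releases them for commit in creation order. The sketch below is an assumption about how such ordering could be enforced, not a description of the kisekifs code; `CommitQueue`, `push`, and `mark_uploaded` are hypothetical names.

```rust
use std::collections::VecDeque;

/// Hypothetical queue that releases slices for commit strictly in creation order.
#[derive(Debug, Default)]
pub struct CommitQueue {
    /// (slice id, upload finished?) pairs, kept in creation order.
    pending: VecDeque<(u64, bool)>,
}

impl CommitQueue {
    /// Register a newly created slice at the back of the queue.
    pub fn push(&mut self, slice_id: u64) {
        self.pending.push_back((slice_id, false));
    }

    /// Record that `slice_id` has finished uploading and return the ids that
    /// may be committed now, in order. A slice is held back until every slice
    /// created before it has been released.
    pub fn mark_uploaded(&mut self, slice_id: u64) -> Vec<u64> {
        if let Some(entry) = self.pending.iter_mut().find(|(id, _)| *id == slice_id) {
            entry.1 = true;
        }
        let mut ready = Vec::new();
        // Pop from the front only while the oldest slice is already uploaded.
        while let Some(&(id, true)) = self.pending.front() {
            self.pending.pop_front();
            ready.push(id);
        }
        ready
    }
}
```

With slices 1, 2, 3 queued, marking slice 2 as uploaded releases nothing; marking slice 1 afterwards releases 1 and 2 together. The cost of this ordering is that one slow slice delays the commit of every slice created after it.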

Bench

The numbers below are for reference only; the benchmark is far from accurate.

JuiceFS's bench

MetaEngine: local rocksdb; ObjectStorage: local minio storage; WriteBackCacheSize: 10G; WriteBufferSize: 2G(1G memory + 1G mmap);

Cleaning kernel cache, may ask for root privilege...

BlockSize: 1 MiB, BigFileSize: 1024 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 1

ITEM	VALUE	COST
Write big file	2359.03 MiB/s	0.43 s/file
Read big file	1139.28 MiB/s	0.90 s/file
Write small file	1001.5 files/s	1.00 ms/file
Read small file	6276.5 files/s	0.16 ms/file
Stat file	94324.9 files/s	0.01 ms/file

Why Kiseki?

Kiseki means "miracle" in Japanese; I picked the name up from a Japanese song while browsing the internet.

I am just interested in Rust and passionate about distributed systems. Kiseki is my own exploration of building a file system in this language.

Will you continue to develop it?

I don't know yet. The basic storage part is done, developing it has been fun, and I have learned a lot from it, so I have basically achieved my goal. The rest of the work is endless POSIX compliance, performance optimization, and edge-case handling. Since I'm not employed by any company at present, I have to focus on job hunting and other things... But who knows, we will see.

Contribution

Contributions are very welcome; just open a PR or an issue.

Why JuiceFS?

While JuiceFS offers a powerful tool for data management, porting it to Rust presents a unique opportunity to delve into the inner workings of file systems and gain a deeper understanding of their design and implementation.

Kiseki