Changes for page Ceph

Last modified by Jonas Jelten on 2024/09/13 15:05

From version 3.1
edited by Jonas Jelten
on 2024/08/23 14:09
Change comment: There is no comment for this version
To version 5.1
edited by Jonas Jelten
on 2024/09/13 15:05
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -53,7 +53,6 @@
53 53  
54 54  Now you have a raw storage device, but you can't yet store files on it, since you are missing a filesystem.
55 55  
56 -
57 57  ## RBD formatting
58 58  
59 59  Now that you have mapped your RBD, we can create file system structures on it.
... ... @@ -65,6 +65,7 @@
65 65  ```
66 66  
67 67  get the newly created filesystem UUID:
67 +
68 68  ```
69 69  sudo blkid /dev/rbdxxx
70 70  ```
... ... @@ -72,6 +72,7 @@
72 72  Now we create an entry in `/etc/fstab` with `noauto` so that the script below triggers the mount and it is not attempted too early during boot.
73 73  
74 74  `/etc/fstab`:
75 +
75 75  ```
76 76  UUID=your-new-fs-uuid /your/mount/point ext4 defaults,_netdev,acl,noauto,nodev,nosuid,noatime,stripe=1024 0 0
77 77  ```
... ... @@ -79,6 +79,7 @@
79 79  In order to mount this filesystem on your server, we need a mount helper script (otherwise the RBD is not yet mapped when `/etc/fstab` tries to mount it directly during boot).
80 80  
81 81  `/etc/ceph/rbd.d/ORG-rbd/namespacename/rbdname`:
83 +
82 82  ```bash
83 83  #!/bin/bash
84 84  
... ... @@ -89,17 +89,15 @@
89 89  # mount all the filesystems
90 90  mountpoint -q /your/mount/point || mount /your/mount/point
91 91  ```
92 -Mark this script *executable* so `rbdmap` can execute it as post-mapping hook!
93 93  
95 +Mark this script _executable_ so `rbdmap` can execute it as a post-mapping hook!
96 +
94 94  To test, either restart `rbdmap.service` or manually call `umount` and `mount` for `/your/mount/point`.
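For example, using the placeholder path and mount point from above:

```bash
# make the helper script executable
sudo chmod +x /etc/ceph/rbd.d/ORG-rbd/namespacename/rbdname

# restart rbdmap, which remaps the RBDs and runs the post-mapping hooks ...
sudo systemctl restart rbdmap.service

# ... or re-trigger just the mount by hand
sudo umount /your/mount/point
sudo mount /your/mount/point
```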
95 95  
96 -
97 97  ## LVM on RBD
98 98  
99 -You can create LVM `pvs` and `lvs` on your RBD. You can use this for read/write caching, for example (see below).
100 -This works like usual, just do `pvcreate` etc.
101 +You can create LVM `pvs` and `lvs` on your RBD. You can use this for read/write caching, for example (see below). This works as usual; just run `pvcreate` etc.
101 101  
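A minimal sketch, assuming the mapped RBD shows up as `/dev/rbd0` (hypothetical device name, check with `rbd showmapped`) and reusing the `datavg`/`datalv` names from the caching example below:

```bash
# put LVM directly on the mapped RBD
pvcreate /dev/rbd0
vgcreate datavg /dev/rbd0
lvcreate -n datalv -l 100%FREE datavg

# then format and mount the LV as described above
mkfs.ext4 /dev/datavg/datalv
```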
102 -
103 103  ## RBD tuning
104 104  
105 105  To get more performance, there are some useful tweaks:
... ... @@ -109,6 +109,7 @@
109 109  When your server is sufficiently shielded behind firewalls and isn't susceptible to attacks, disable the CPU bug mitigations via a kernel command line parameter for a performance boost:
110 110  
111 111  `/etc/default/grub`:
112 +
112 112  ```
113 113  GRUB_CMDLINE_LINUX_DEFAULT="mitigations=off"
114 114  ```
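Afterwards regenerate the GRUB configuration and reboot so the parameter takes effect (the exact command depends on your distribution; shown here for Debian/Ubuntu):

```
sudo update-grub   # or: sudo grub-mkconfig -o /boot/grub/grub.cfg
sudo reboot
```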
... ... @@ -115,9 +115,10 @@
115 115  
116 116  ### Read-Ahead
117 117  
118 -We read ahead 1MiB, since Ceph stores the objects in 4MiB blocks anyway.
119 +We read ahead 1MiB, since Ceph stores the objects in 4MiB blocks anyway. We also allow more parallel requests and use no IO scheduler (since Ceph is distributed, requests have roughly equal latency anyway).
119 119  
120 120  `/etc/udev/rules.d/90-ceph-rbd.rules`:
122 +
121 121  ```
122 122  KERNEL=="rbd[0-9]*", ENV{DEVTYPE}=="disk", ACTION=="add|change", ATTR{bdi/read_ahead_kb}="1024" ATTR{queue/scheduler}="none" ATTR{queue/wbt_lat_usec}="0" ATTR{queue/nr_requests}="2048"
123 123  ```
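To apply the rule without rebooting, reload udev and re-trigger the block devices; `rbd0` below is just an assumed device name:

```
sudo udevadm control --reload
sudo udevadm trigger --subsystem-match=block
cat /sys/block/rbd0/bdi/read_ahead_kb   # should now print 1024
```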
... ... @@ -124,12 +124,11 @@
124 124  
125 125  ### LVM-Cache
126 126  
127 -see `man 7 lvmcache`.
128 -We can cache the RBD on a local NVMe for more performance.
129 129  See `man 7 lvmcache`. We can cache the RBD on a local NVMe for more performance.
129 129  
130 130  * `/dev/fastdevice` is the name of the local NVMe.
131 131  * `/dev/datavg/datalv` is the name of your existing logical volume containing all the stored data on Ceph.
132 -* we recommend writeback caching
133 +* we recommend read and write caching, and a local fastdevice size of at least 50GiB; the more the better :)
133 133  
134 134  ```bash
135 135  ## setup
... ... @@ -174,3 +174,11 @@
174 174  # remove pv from vg
175 175  vgreduce datavg /dev/fastdevice
176 176  ```
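To check that the cache is attached and which devices back it, the standard LVM reporting tools are enough, e.g.:

```bash
# list all LVs in datavg, including hidden cache volumes, with their backing devices
sudo lvs -a -o +devices datavg
```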
178 +
179 +### NFS tuning
180 +
181 +In `/etc/default/nfs-kernel-server`:
182 +```
183 +echo "1048576" > /proc/fs/nfsd/max_block_size # allow 1MiB IO size (even more is possible)
184 +RPCNFSDCOUNT=64 # number of worker threads
185 +```
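The settings only take effect after restarting the NFS server (unit name as on Debian/Ubuntu):

```
sudo systemctl restart nfs-kernel-server
```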