On Saturday, August 28, 2021 5:43:49 PM CEST Alex Elder wrote:
On 8/16/21 2:50 PM, Fabio M. De Francesco wrote:
Convert greybus/uart.c from IDR to XArray. The abstract data type XArray is more memory-efficient, parallelisable, and cache friendly. It takes advantage of RCU to perform lookups without locking. Furthermore, IDR is deprecated because XArray has a better (cleaner and more consistent) API.
Signed-off-by: Fabio M. De Francesco fmdefrancesco@gmail.com
I have one more comment, below. Generally, I don't think it is important to make this change, but I think it's fine to switch to the newer XArray interface. The result is a little simpler.
I agree that the result of using XArray is a little simpler and readable. As far as performance is regarded (memory-efficiency, cache friendliness, parallelization improvements) I have to take for true the words of Matthew W.. Some time ago I did a similar conversion for staging/unisys/visorhba after discussing with him on IRC; he confirmed that the driver would have got several benefits. This is why I decided to do this work on staging/greybus too.
I cannot affirm the same for IDA to XArray conversions, since IDA are relatively lighter and efficient than IDR. Unfortunately, I cannot profile such conversions in order to prove/disprove they *really* gain on execution time and/or memory footprint.
[]
static int gb_uart_receive_data_handler(struct gb_operation *op) { @@ -341,8 +341,8 @@ static struct gb_tty *get_gb_by_minor(unsigned int minor) { struct gb_tty *gb_tty;
- mutex_lock(&table_lock);
- gb_tty = idr_find(&tty_minors, minor);
- xa_lock(&tty_minors);
I'm basically new to using the XArray interface, but I don't think you really need the xa_lock()/xa_unlock() calls here. You are not relying on reference counting to control when the allocated minor device numbers are freed, so I'm pretty sure you can simply call xa_load() to look up the gb_tty for the given minor device.
I haven't yet had time to understand how driver works. However, I can attempt a response mostly due to logic than to a real knowledge of how drivers work...
(1) I see that alloc_minor is called at "probe" (that I suppose it means the the kernel "feels" that a new device has been added and so it should initialize it somehow and make it ready to operate properly - I hope I'm not too far from the truth :)).
(2) I see that xa_alloc() finds an *unused* identifier and, if it succeeds, that identifier is used as the "minor". So, we have one minor per device and that the same minor cannot be re-assigned to other devices. It also should mean that there's no need for reference counting because that "minor" is not shared.
(3) If the logic above is sound, we have a 1:1 correspondence between minors and devices (max 16 gb_tty's) and therefore we don't need to lock tty_minors because concurrent code passes different minors to xa_load() which always returns different gb_tty's.
If the above argument is wrong I think I should read a book on device drivers for the first time. I have Greg's but I haven't yet opened it for reading :)
Thanks,
Fabio
But please don't only take my word for it; investigate it for yourself, and if needed ask others about it so you're confident it's correct. There is no harm in taking the lock, but if it's not needed, it would be nice to avoid it.
If you conclude the locks are necessary, just say so, and explain why, and I'll probably just accept it. Otherwise, please explain why you are sure they are not needed when you send version 4. Thank you.
-Alex
- gb_tty = xa_load(&tty_minors, minor); if (gb_tty) { mutex_lock(&gb_tty->mutex); if (gb_tty->disconnected) {
@@ -353,19 +353,19 @@ static struct gb_tty *get_gb_by_minor(unsigned int minor) mutex_unlock(&gb_tty->mutex); } }
- mutex_unlock(&table_lock);
- xa_unlock(&tty_minors); return gb_tty;
}