Discussion:
RFC: [PATCH hurd 3/6] Add file record locking support: libtrivfs_file_record_lock.patch
Svante Signell
2016-02-08 11:53:56 UTC
Permalink
Proxy implementation added to the original patches.
 
Note: This proxy version does not work; e.g. set-fcntl /dev/null gives error
code EACCES. The problem has been traced to fsServer.c:

3386: if (MACH_MSGH_BITS_LOCAL (In0P->Head.msgh_bits) == MACH_MSG_TYPE_PROTECTED_PAYLOAD)
3387:   file = diskfs_begin_using_protid_payload(In0P->Head.msgh_protected_payload);
3388: else
3389:   file = diskfs_begin_using_protid_port(In0P->Head.msgh_request_port);
3390:
3391: OutP->RetCode = diskfs_S_file_record_lock(file, In0P->cmd, &In0P->flock64);
3392: diskfs_end_using_protid_port(file);

Test case: set-fcntl /dev/null
After the call on line 3387
file->po->openstat=0
while for set-fcntl foo it is
file->po->openstat=3

Setting it to 3 in gdb makes the program finish successfully.

Comments:
- The proxy implementation obviously does not work. How can the else branch
be activated for testing purposes?
- I have a working solution, which adds two new structs to struct trivfs_peropen
  struct rlock_peropen lock_status;
  struct trivfs_node *tp;

This solution was not accepted; maybe finding the bug in the MIG-generated
diskfs_begin_using_protid_payload would be much better.
Justus Winter
2016-02-08 13:19:56 UTC
Permalink
Quoting Svante Signell (2016-02-08 12:53:56)
Post by Svante Signell
Proxy implementation added to the original patches.
 
Note: This proxy version does not work; e.g. set-fcntl /dev/null gives error
3386: if (MACH_MSGH_BITS_LOCAL (In0P->Head.msgh_bits) == MACH_MSG_TYPE_PROTECTED_PAYLOAD)
3387:   file = diskfs_begin_using_protid_payload(In0P->Head.msgh_protected_payload);
3388: else
3389:   file = diskfs_begin_using_protid_port(In0P->Head.msgh_request_port);
3391: OutP->RetCode = diskfs_S_file_record_lock(file, In0P->cmd, &In0P->flock64);
3392: diskfs_end_using_protid_port(file);
Test case: set-fcntl /dev/null
After the call on line 3387
file->po->openstat=0
while for set-fcntl foo it is
file->po->openstat=3
Setting it to 3 in gdb makes the program finish successfully.
- The proxy implementation obviously does not work. How can the else branch
be activated for testing purposes?
- I have a working solution, which adds two new structs to struct trivfs_peropen
  struct rlock_peropen lock_status;
  struct trivfs_node *tp;
This solution was not accepted; maybe finding the bug in the MIG-generated
diskfs_begin_using_protid_payload would be much better.
Nack.

As I said on IRC, there is no evidence that the protected payload
mechanism is not working as expected. We have been using it for every
single RPC that a Hurd server services since Hurd 0.6.

But if you insist, take an old gnumach that doesn't do the protected
payload thing and use it to boot. Protected payloads are used as an
optimization, and the Hurd will fall back to the hash table lookup.

That openstat is 0 has nothing to do with protected payloads. I
believe it is because the underlying node of the 'null' translator is
opened with flags=0.

Justus
Svante Signell
2016-02-08 13:44:09 UTC
Permalink
Post by Justus Winter
Quoting Svante Signell (2016-02-08 12:53:56)
Post by Svante Signell
Proxy implementation added to the original patches.
 
As I said on IRC, there is no evidence that the protected payload
mechanism is not working as expected.  We have been using it for every
single RPC that a Hurd server services since Hurd 0.6.
You are probably right. 
Post by Justus Winter
That openstat is 0 has nothing to do with protected payloads.  I
believe it is because the underlying node of the 'null' translator is
opened with flags=0.
Are you referring to the nullauth() call in null.c here? I've tried removing
that call, but nothing changed.

Where can I find the code that sets the flags of the null translator's
underlying node to zero?
Justus Winter
2016-02-08 14:04:14 UTC
Permalink
Quoting Svante Signell (2016-02-08 14:44:09)
Post by Svante Signell
Post by Justus Winter
That openstat is 0 has nothing to do with protected payloads.  I
believe it is because the underlying node of the 'null' translator is
opened with flags=0.
Are you referring to the nullauth() call in null.c here? I've tried removing
that call, but nothing changed.
No.
Post by Svante Signell
Where can I find the code that sets the flags of the null translator's
underlying node to zero?
It starts with /dev/null having a passive translator record. When it is
first accessed, diskfs_S_dir_lookup starts the translator on
demand. Somewhere between diskfs_S_dir_lookup, fshelp_fetch_root, and
fshelp_start_translator_long, there is a callback that opens the
underlying node.

Justus
Samuel Thibault
2018-11-18 00:13:38 UTC
Permalink
Hello,

Triggered by an IRC discussion, I had a look at this old mail still in my inbox.
Post by Justus Winter
Quoting Svante Signell (2016-02-08 14:44:09)
Post by Svante Signell
Where can I find the code that sets the flags of the null translator's
underlying node to zero?
It starts with /dev/null having a passive translator record. When it is
first accessed, diskfs_S_dir_lookup starts the translator on
demand. Somewhere between diskfs_S_dir_lookup, fshelp_fetch_root, and
fshelp_start_translator_long, there is a callback that opens the
underlying node.
More precisely, it's trivfs_startup which calls fsys_startup to get the
underlying node. It just passes the flags parameter it was given, which
happens to be 0 in basically all translators, except the term translator
in some cases. I don't really know why, except probably to avoid potential
spurious side effects.

Quoting Svante Signell (2016-02-08 12:53:56)
Post by Svante Signell
- I have a working solution, which adds two new structs to struct trivfs_peropen
struct rlock_peropen lock_status;
struct trivfs_node *tp;
This solution was not accepted
I don't remember the discussion which refused this solution. I guess
your working solution is to implement the record-lock in trivfs
itself? It sort of makes sense to me actually: the data of the
underlying node and the data of the node exposed by the translator
are not really related, I don't see why we should necessarily proxy
the lock. Apparently the only parts which are proxied are file access
permissions and time, i.e. information of the inode itself.

Samuel
Svante Signell
2018-11-18 12:44:40 UTC
Permalink
Post by Samuel Thibault
Hello,
Triggered by an IRC discussion, I had a look at this old mail still in my inbox.
Quoting Svante Signell (2016-02-08 12:53:56)
Post by Svante Signell
- I have a working solution, which adds two new structs to struct trivfs_peropen
struct rlock_peropen lock_status;
struct trivfs_node *tp;
This solution was not accepted
I don't remember the discussion which refused this solution. I guess
your working solution is to implement the record-lock in trivfs
itself? It sort of makes sense to me actually: the data of the
underlying node and the data of the node exposed by the translator
are not really related, I don't see why we should necessarily proxy
the lock. Apparently the only parts which are proxied are file access
permissions and time, i.e. information of the inode itself.
Do you say that you are suddenly interested in the proposal I made in
2016? That proposal was completely dismissed by Justus back then! I don't
even know if I still have these modifications somewhere; I've had
several crashes with data loss since 2016.
Samuel Thibault
2018-11-18 12:50:39 UTC
Permalink
Post by Svante Signell
Post by Samuel Thibault
I don't remember the discussion which refused this solution. I guess
your working solution is to implement the record-lock in trivfs
itself? It sort of makes sense to me actually: the data of the
underlying node and the data of the node exposed by the translator
are not really related, I don't see why we should necessarily proxy
the lock. Apparently the only parts which are proxied are file access
permissions and time, i.e. information of the inode itself.
Do you say that you are suddenly interested in the proposal I made in
2016?
Well, yes?
Just to be clear: it's not "sudden"; that couple of mails between
Justus and you has been in my inbox since then, buried behind hundreds of other
mails. The IRC discussion just brought up that something was blocked there.

Samuel
Svante Signell
2018-12-01 15:11:32 UTC
Permalink
Post by Samuel Thibault
Post by Svante Signell
Post by Samuel Thibault
I don't remember the discussion which refused this solution.
Found some stuff in IRC logs:
2015-12-20:
14:43:53< gnu_srs1> youpi: So it's OK to add struct rlock_peropen lock_status to
struct trivfs_peropen?
14:47:07< youpi> gnu_srs1: rlock: it should be fine, yes

2016-01-07:
11:13:35< gnu_srs> The only translator calling trivfs_open is magic. Maybe the
creation of trivfs_node should be added directly to file_record_lock, file_lock
and/or file_lock_stat.
11:14:04< gnu_srs> There is a call to trivfs_open also in term/users.c
11:16:56< teythoon> that might be b/c magic is the only trivfs translator
exposing a directory
11:21:33< gnu_srs> What about adding the creation of trivfs_node directly?
11:45:33< gnu_srs> In case of single file translators, like null, is struct
protid/struct peropen not needed?
11:47:48< teythoon> gnu_srs: yes it is
11:48:22< teythoon> a peropen object is needed for each file handle that is
opened, hence the name
11:48:29< teythoon> but there is only one node object
11:48:34< teythoon> that must be initialized somewhere
11:52:47< gnu_srs> since initialisation cannot be placed in trivfs_open, can it
be put directly in the routines: file_record_lock, file_lock and/or
file_lock_stat?
12:09:01< teythoon> the object should be initialized where it is created
12:09:37< teythoon> if it is static (e.g. a global variable), then it should be
initialized at program start
12:13:46< teythoon> gnu_srs: actually, libtrivfs doesn't do the actual node
handling, it proxies everything to the underlying node
12:14:10< teythoon> you should make sure that the locking rpc gets proxied, and
then libtrivfs is fine i guess
12:15:17< gnu_srs> How to do that?
12:16:26< teythoon> what is the name of the rpc call ?
12:16:51< teythoon> see e.g. libtrivfs/file-chmod.c
12:19:01< gnu_srs> The new RPC is file_record_lock, two others are file_lock and
file_lock_stat
12:20:46< teythoon> try to forward them like file_chmod is forwarded
12:20:54< gnu_srs> So the proxy returns the call to the RPC?
12:21:28< teythoon> don't worry about that, the messy underlying details are
hidden thanks to MIG
12:21:44< teythoon> you just call a function
12:22:11< gnu_srs> OK, I'll try. HTH :-D
14:22:34< gnu_srs> teythoon: ./checklock /dev/null: ./checklock:
file_record_lock: Permission denied also as root. Is this due to setnullauth()?
14:43:16< teythoon> gnu_srs: hum, possibly, i don't know
14:43:32< teythoon> you can easily check this though, just don't call
setnullauth
14:49:48< gnu_srs> No, it was not due to nullauth :(
14:53:48< gnu_srs> Back to the more complex patch then, or?
15:04:27< gnu_srs> How does this proxy stuff work, e.g. libtrivfs/file-chmod.c
(trivfs_S_file_chmod) ?
15:43:33< teythoon> gnu_srs: i'm still pretty sure that proxying is the right
thing to do (if it is for file_chmod, then also for this)
15:49:42< gnu_srs> How does this proxying work: Which implementation is
executed?
16:21:24< gnu_srs> Ah I think I'm getting closer with the proxy solution. In
trivfs.h the new struct trivfs_node has to be defined. How should it be defined:
Like in diskfs.h or netfs.h, or?
16:21:45< teythoon> no
16:22:03< teythoon> libtrivfs has no node type
16:22:21< teythoon> instead, it delegates the file_* rpcs to the underlying node
16:22:25< teythoon> like file_chmod
16:43:03< gnu_srs> The function file_record_lock calls fshelp_rlock_tweak which
has arguments: struct rlock_box *box,..., struct rlock_peropen *po,...
16:45:59< gnu_srs> Here is the implementation of trivfs_S_file_record_lock:
return cred ? file_record_lock (cred->realnode, cmd, lock) : EOPNOTSUPP;
16:46:17< teythoon> yes
16:49:35< gnu_srs> See the line above too.
17:28:47< teythoon> gnu_srs: what about it ?
17:30:09< gnu_srs> How are these structs created and accessed?
17:30:47< gnu_srs> with the underlying trick
17:31:52< gnu_srs> Which implementation of file_record_lock is called?
17:34:42< teythoon> well, if you do settrans -a /dev/null /hurd/null, then
/dev/null (the node on some ext2fs) is the underlying node handed to /hurd/null
17:35:28< teythoon> so, if /hurd/null relays the request as it does for
file_chmod, then the ext2fs that /dev/null is on is the receiver
17:36:19< teythoon> for details how this works, see hurd/fsys.defs
17:37:20< teythoon> see fsys_startup in particular
17:37:56< gnu_srs> k!
17:40:15< gnu_srs> But /dev/null is not a regular file?
17:41:59< gnu_srs> The error I get is EACCES, not EOPNOTSUPP.
17:43:14< gnu_srs> file /dev/null: /dev/null: character special (0/0) (when
translated by /hurd/null)
18:41:58< gnu_srs> (18:36:49) teythoon: so, if /hurd/null relays the request as
it does for file_chmod, then the ext2fs that /dev/null is on is the receiver
18:41:59< gnu_srs> If you mean the device file /dev/null itself should it fall
back to the libtrivfs implementation of file_record_lock, or?
18:42:35< gnu_srs> s/libtrivfs/libdiskfs/g
18:50:59< teythoon> yes

2016-01-08:
11:07:22< gnu_srs> Regarding the proxy solution, I'll look into that
later on. Now I have a working solution by adding a node struct to trivfs.h
11:12:48< gnu_srs> I did implement the proxy version too, and obtain
EACCES as a response.
12:00:44< teythoon> the other "solution" is no solution b/c it is
likely just wrong

2016-01-13:
11:37:35< gnu_srs> teythoon: I've traced the EACCES to fsServer.c:
file =
diskfs_begin_using_protid_payload(In0P->Head.msgh_protected_payload)
11:37:35< gnu_srs> for checklock foo file->po->openstat=3 while for
checklock /dev/null it is zero
11:42:42< gnu_srs> and setting it to 3 (O_RDWR) in gdb make the test
finish OK
11:59:49< teythoon> gnu_srs: i guess the issue is that the underlying
file is not writable, and hence the new locking rpc returns EACCES
when used on the underlying file
Post by Samuel Thibault
Post by Svante Signell
Post by Samuel Thibault
I guess your working solution is to implement the record-lock in trivfs
itself? It sort of makes sense to me actually: the data of the
underlying node and the data of the node exposed by the translator
are not really related, I don't see why we should necessarily proxy
the lock. Apparently the only parts which are proxied are file access
permissions and time, i.e. information of the inode itself.
Do you say that you are suddenly interested in the proposal I made in
2016?
Well, yes?
Just to be clear: it's not "sudden"; that couple of mails between
Justus and you has been in my inbox since then, buried behind hundreds of other
mails. The IRC discussion just brought up that something was blocked there.
The updated (non-proxy) implementation now works fine. It no longer crashes
the null translator :)
Samuel Thibault
2018-12-01 17:50:31 UTC
Permalink
[...]
Post by Svante Signell
12:13:46< teythoon> gnu_srs: actually, libtrivfs doesn't do the actual node
handling, it proxies everything to the underlying node
Well, not everything :) notably not the content.
Post by Svante Signell
12:14:10< teythoon> you should make sure that the locking rpc gets proxied, and
then libtrivfs is fine i guess
I don't think we want to proxy the record locking, as it's about the
content, which only the translator knows about. The fact that most
translators open the underlying node with access mode 0 (which entails
the issue you are getting) is a sign of this :)
Post by Svante Signell
15:43:33< teythoon> gnu_srs: i'm still pretty sure that proxying is the right
thing to do (if it is for file_chmod, then also for this)
chmod is a different beast, we indeed want the underlying node to
provide the permissions to be used on the translator.
Post by Svante Signell
The updated (non-proxy) implementation now works fine. It no longer crashes
the null translator :)
Good :)

Samuel
