gdrcopy_sanity fails with gdrdrv loaded #334

@ekeever1

We are experiencing a failure with gdrdrv across multiple systems (x86 and arm).

With gdrdrv-2.5.1-1 installed, the kernel module loads, but gdrcopy_sanity fails. The only log output of any kind is in dmesg at load time:
[root@illyad log]# modprobe gdrdrv
[691802.796281] gdrdrv:gdrdrv_init:loading gdrdrv version 2.5 built for proprietary NVIDIA driver
[691802.796286] gdrdrv:gdrdrv_init:device registered with major number 490
[691802.796288] gdrdrv:gdrdrv_init:dbg traces disabled, info traces disabled
[691802.796290] gdrdrv:gdrdrv_init:Persistent mapping will be used

[root@illyad log]# modinfo gdrdrv
filename: /lib/modules/4.18.0-553.16.1.el8_10.x86_64/extra/gdrdrv.ko.xz
version: 2.5
description: GDRCopy kernel-mode driver built for proprietary NVIDIA driver
license: Dual MIT/GPL
author: drossetti@nvidia.com
rhelversion: 8.10
srcversion: 628A1EE2623290E9768AEB0
depends: nv-p2p-dummy
name: gdrdrv
vermagic: 4.18.0-553.16.1.el8_10.x86_64 SMP mod_unload modversions
sig_id: PKCS#7
signer: DKMS module signing key
sig_key: 3D:B7:A3:06:14:1E:2D:71:02:9A:5E:B0:CC:BE:08:B5:20:31:91:BA
sig_hashalgo: sha256
signature: 3E:75:F0:7B:F4:19:F4:8E:70:7F:8F:83:A6:5F:F1:91:D4:25:59:06:
7E:BA:94:0E:8A:3C:8F:3E:DA:86:CE:B5:3B:CA:E2:0C:A1:A4:D4:CF:
8C:D2:20:F5:61:E2:E9:B3:38:6B:09:FB:53:91:76:B0:73:C2:13:9E:
0A:6B:21:1A:E2:84:F2:E2:5D:9F:FD:26:8C:87:54:E4:93:DA:91:7D:
AB:2F:26:A9:3A:96:8D:EF:DB:2B:70:03:69:0C:49:C3:61:CE:B3:8D:
B8:7D:81:11:2A:AD:04:E8:96:17:B7:DB:82:AF:82:05:69:87:7C:44:
D4:0D:BE:2E:F2:D5:E7:F8:2E:7F:85:58:48:68:E5:B6:23:1C:EE:2F:
D2:62:A3:0E:F2:72:D8:C5:49:11:F3:BB:B4:7D:BB:F7:46:54:F9:BF:
C5:42:2E:EA:AA:3C:2D:5C:61:40:7C:39:3E:DB:FF:DC:04:0A:B2:EA:
AD:DD:6D:E5:7D:42:38:B5:BE:F1:02:88:E1:EB:54:81:75:42:58:BF:
1C:CF:F5:09:32:BF:14:E7:4A:32:01:C8:C5:A3:83:70:DF:D1:98:D6:
5A:59:18:38:9A:48:A0:A2:B6:A0:9C:A6:05:EB:C0:0E:08:0E:48:89:
93:20:CE:12:4C:B3:4E:82:D2:89:C5:59:D8:AE:B0:2B
parm: dbg_enabled:enable debug tracing (int)
parm: info_enabled:enable info tracing (int)
parm: use_persistent_mapping:use persistent mapping instead of traditional (non-persistent) mapping (int)

Any attempt to use it fails:
[root@illyad log]# gdrcopy_sanity
gdr_open error: Is gdrdrv driver installed and loaded?
gdr_open error: Is gdrdrv driver installed and loaded?

The module is most certainly loaded, but there is no /dev/gdr*:
[root@illyad log]# lsmod | grep gdr
gdrdrv 24576 0
nvidia 103858176 8 nvidia_uvm,gdrdrv,nvidia_modeset
[root@illyad log]# ls /dev/gdr*
ls: cannot access '/dev/gdr*': No such file or directory

Nothing has deigned to write any sort of meaningful error output anywhere (dmesg, syslog, journald, messages, etc.), which leaves me few options for providing more useful information here. SELinux is not enforcing, which eliminates the most common suspect when it comes to mysterious file-[allegedly]-doesn't-exist problems.
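
One check that might narrow this down: dmesg reports that the driver registered a character device with major number 490, yet no node exists under /dev, so it may simply be that nothing created the node after modprobe. The sketch below looks up the major in /proc/devices and prints, without running it, the mknod command that would create the node by hand. The function names find_major and suggest_mknod are made up for illustration, and the node name /dev/gdrdrv with minor 0 is an assumption about what the library opens, not something confirmed by the logs above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of GDRCopy.

# Print the major number registered under a device name.
# $1: device name (e.g. gdrdrv)
# $2: devices table to read, defaults to /proc/devices
find_major() {
    awk -v name="$1" '$2 == name { print $1; exit }' "${2:-/proc/devices}"
}

# Print (do NOT execute) a command that would create the device node.
# ASSUMPTION: the node is /dev/gdrdrv with minor number 0.
# $1: devices table to read, defaults to /proc/devices
suggest_mknod() {
    major=$(find_major gdrdrv "${1:-/proc/devices}")
    if [ -z "$major" ]; then
        echo "gdrdrv is not listed in ${1:-/proc/devices}" >&2
        return 1
    fi
    echo "mknod /dev/gdrdrv c $major 0 && chmod a+rw /dev/gdrdrv"
}
```

Running `suggest_mknod` as is on the affected host should print a command using the same major that dmesg reported, provided the registration is still in place. If creating the node that way makes gdrcopy_sanity work, the problem is in whatever step was supposed to create the node at install or boot time rather than in the module itself.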
