
Cannot load Custom Op file in the container LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.


Description

I am in the development phase of running a deep learning model on Triton Inference Server. I am using the LD_PRELOAD trick to load the custom ops needed to support inference, but the libraries do not load correctly and give the following error in the container logs:

 priyankasaraf@priyank-ltmatct script % kubectl logs triton-54c965dcd9-tqkjx -c triton-server   
ERROR: ld.so: object '/triton/lib/_sentencepiece_tokenizer.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/triton/lib/_normalize_ops.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/triton/lib/_regex_split_ops.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/triton/lib/_wordpiece_tokenizer.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

Expected behavior is a successful inference, but the current response is: "Op type not registered 'CaseFoldUTF8' in binary running on triton-54c965dcd9-tqkjx. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed."
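Before digging into the deployment, it helps to rule out the simpler causes of this error: confirm from outside the pod that the shared objects actually exist in the container, and that the dynamic loader can resolve their own dependencies. A minimal sketch, reusing the pod and container names from the log above:

# Confirm the shared objects are present in the mounted volume
kubectl exec triton-54c965dcd9-tqkjx -c triton-server -- ls -l /triton/lib

# Check that each library's own dependencies resolve; a "not found" entry
# here yields the same "cannot open shared object file" message even when
# the file itself exists
kubectl exec triton-54c965dcd9-tqkjx -c triton-server -- ldd /triton/lib/_sentencepiece_tokenizer.so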

Environment

  • TensorRT Version:
  • GPU Type: GPU 0: Tesla V100-SXM2-16GB
  • TRITON_SERVER_VERSION=2.15.0
  • NVIDIA_TRITON_SERVER_VERSION=21.10
  • NSIGHT_SYSTEMS_VERSION=2021.3.2.4
  • Triton Image: 21.10
  • CUDA Version: CUDA_VERSION=11.4.3.001
  • CUDA_DRIVER_VERSION=470.57.02
  • CUDNN Version:
  • Operating System + Version: Ubuntu 20.04.3 LTS (focal)
  • Python Version (if applicable): python3.8

Are you using the Triton container or did you build it yourself? Using image 21.10

To Reproduce: A deployment is created with one container running the triton-inference-server with the following arguments (the triton-server container's YAML spec is below).

      containers:
        - name: triton-server
          image: "21.10"
          command: ["/bin/bash"]
          # About "backend-config": All backends are initialized; pytorch, tensorflow, openvino & onnxruntime. 
          # We are overriding Tensorflow version to be loaded by default to 2 (Rest of them will still load)
          # --backend-config=tensorflow,version=2
          # Ref: https://github.com/triton-inference-server/tensorflow_backend/blob/40f9d94ca1243de004c609cf9b056de19462d545/README.md
          args: ["-c",
                 "export LD_LIBRARY_PATH=/opt/tritonserver/backends/tensorflow2:$LD_LIBRARY_PATH
                 && export LD_PRELOAD='/triton/lib/_sentencepiece_tokenizer.so /triton/lib/_normalize_ops.so /triton/lib/_regex_split_ops.so /triton/lib/_wordpiece_tokenizer.so'
                 && tritonserver
                 --model-repository=/models/triton
                 --backend-config=tensorflow,version=2
                 --log-verbose=5
                 --log-info=true
                 --log-warning=true
                 --log-error=true
                 --http-port=8000
                 --grpc-port=8001
                 --metrics-port=8002
                 --model-control-mode=explicit
                 --grpc-use-ssl=false"
          ]
          volumeMounts:
            - mountPath: /models/triton
              name: models
            - mountPath: /triton/lib
              name: libraries
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /v2/health/live
              port: http
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 1
          ports:
          - containerPort: 8000
            name: http
            protocol: TCP
          - containerPort: 8001
            name: grpc
            protocol: TCP
          - containerPort: 8002
            name: http-metrics
            protocol: TCP
          readinessProbe:
            successThreshold: 1
            failureThreshold: 3
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 1
            httpGet:
              path: /v2/health/live
              port: http
              scheme: HTTP
          resources:
            requests:
              cpu: 2
              memory: 12G
            limits:
              cpu: 3
              memory: 24G

Model description (framework, inputs, outputs): the model is a TensorFlow SavedModel with one string input and one FP32 output. Its model configuration file, config.pbtxt:

name: "c8d9316a-1cfb-4b4b-aea3-3659e9dc5a17"
platform: "tensorflow_savedmodel"
input {
  name: "input_1"
  data_type: TYPE_STRING
  dims: [-1, 1]
}
output {
  name: "model_exporter"
  data_type: TYPE_FP32
  dims: [-1, 768]
}
instance_group {
  count: 1
}

Metadata.json

{
    "inputs": [
        {
            "datatype": "BYTES",
            "name": "input_1",
            "shape": [
                -1,
                1
            ]
        }
    ],
    "model_id": "SearchQnA",
    "model_version": "1",
    "outputs": [
        {
            "datatype": "FP32",
            "name": "model_exporter",
            "shape": [
                -1,
                768
            ]
        }
    ],
    "platform": "TENSORFLOW"
}
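For context, an inference request matching this configuration looks roughly like the sketch below, using Triton's KServe-v2 HTTP API on the --http-port configured above (assuming the port is reachable, e.g. via kubectl port-forward). The query string is a made-up example, and because the server runs with --model-control-mode=explicit, the model must be loaded through the repository API first:

# Load the model explicitly (required with --model-control-mode=explicit)
curl -X POST localhost:8000/v2/repository/models/c8d9316a-1cfb-4b4b-aea3-3659e9dc5a17/load

# Send one string through the TYPE_STRING (BYTES) input
curl -X POST localhost:8000/v2/models/c8d9316a-1cfb-4b4b-aea3-3659e9dc5a17/infer \
  -H "Content-Type: application/json" \
  -d '{
        "inputs": [
          {"name": "input_1", "datatype": "BYTES", "shape": [1, 1], "data": ["example query"]}
        ],
        "outputs": [{"name": "model_exporter"}]
      }'

The 'CaseFoldUTF8' response quoted above is what this kind of request returns when the preloaded custom-op libraries are missing.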

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

PRIYANKArythem3 commented on Jul 19, 2022:

Thanks. I was able to find the root cause of the issue. I have two containers: the first downloads the necessary binaries to a shared volume mount, and the second runs the triton-inference-server, which should use those binaries from the shared volume mount. But since the containers spin up in parallel, the triton-server container tries to load the binaries before the first container has finished downloading them. Because of this, the second container throws the LD_PRELOAD error but starts the triton-server anyway, without the binaries.
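The standard Kubernetes fix for this kind of startup race is to move the download into an init container, which is guaranteed to run to completion before the main containers start. A minimal sketch against the deployment above; the image name and copy command are placeholders, not the poster's actual setup:

      initContainers:
        # Runs to completion before the main containers start, so
        # /triton/lib is already populated when triton-server evaluates
        # LD_PRELOAD
        - name: fetch-custom-ops
          image: binary-downloader:latest                               # placeholder image
          command: ["/bin/sh", "-c", "cp /opt/ops/*.so /triton/lib/"]   # placeholder download/copy step
          volumeMounts:
            - mountPath: /triton/lib
              name: libraries
      containers:
        - name: triton-server
          # ... unchanged from the spec above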

dyastremsky commented on Jul 19, 2022:

Got it, that makes sense. Thanks for investigating and updating us!
