# Verify Object Integrity Using Checksums in Stook

Checksum validation helps you ensure that data is stored and transferred without corruption.\
When you upload an object, Stook calculates a checksum and stores it as the object’s **ETag**. You can:

* Retrieve the ETag from Stook, and
* Optionally compute your own checksum locally (for example, using Python) and compare the values for **single-part uploads**.

{% hint style="info" %}
ETag is meaningful for comparing with a local MD5 only for **single-part uploads**.\
For multipart uploads, ETag will differ from the MD5 of the full file.
{% endhint %}

### Requirements

Make sure you have:

* A Stook bucket containing the object you want to verify
* Access Key and Secret Key
* One or more of the following tools installed:
  * AWS CLI
  * MinIO Client (`mc`)
  * curl
  * Python with `boto3` (optional, for local checksum calculation)

### Retrieve the checksum (ETag) from Stook

You can retrieve the ETag value using AWS CLI, MinIO Client, or curl.

#### Use AWS CLI

```bash
aws s3api --endpoint-url https://<ENDPOINT_URL> \
  head-object \
  --bucket <BUCKET_NAME> \
  --key <OBJECT_KEY> \
  --profile <AWS_PROFILE>
```

The output includes the ETag:

```json
{
  "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
  ...
}
```
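The same `head-object` lookup can be sketched in Python with `boto3`. This is a minimal sketch rather than an official Stook client: `normalize_etag` and `fetch_etag` are hypothetical helper names, and the endpoint, credentials, bucket, and key are placeholders you supply. As the JSON output above shows, the ETag value arrives wrapped in double quotes, so the helper strips them before any comparison.

```python
def normalize_etag(raw_etag: str) -> str:
    """Strip the surrounding double quotes that S3-style APIs put around ETag values."""
    return raw_etag.strip().strip('"')


def fetch_etag(endpoint_url: str, access_key: str, secret_key: str,
               bucket: str, key: str) -> str:
    """Return the unquoted ETag of an object via a HEAD request (no body download)."""
    import boto3  # imported lazily so normalize_etag works without boto3 installed

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    response = s3.head_object(Bucket=bucket, Key=key)
    return normalize_etag(response["ETag"])
```

Calling `fetch_etag(...)` with your own values returns the bare hexadecimal string, which is easier to compare with a locally computed digest than the quoted form.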

#### Use MinIO Client (mc)

```bash
mc stat <PROFILE_NAME>/<BUCKET_NAME>/<FILE_PATH> --json
```

Example output:

```json
{
  "etag": "d41d8cd98f00b204e9800998ecf8427e",
  ...
}
```

#### Use curl over the Stook endpoint

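If the object is publicly readable, you can request it directly from the Stook endpoint and inspect the `ETag` header. An unsigned request like the one below is typically rejected for private objects: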

```bash
curl -I https://<ENDPOINT_URL>/<BUCKET_NAME>/<FILE_PATH>
```

If the Stook bucket is connected to a CDN Resource, you can also send the request through the CDN URL. Note, however, that the CDN may alter the `ETag` header depending on its caching and response-header configuration, so prefer the Stook endpoint when you verify integrity.

### Understand ETag and multipart behavior

ETag behaves differently for single-part and multipart uploads:

* **Single-part uploads**
  * ETag typically represents the MD5 checksum of the entire object.
  * In this case, a locally computed MD5 of the file can match the ETag.
* **Multipart uploads**
  * ETag has the format:

    ```
    <md5-of-concatenated-part-md5s>-<number_of_parts>
    ```
  * This value **does not** match the MD5 checksum of the entire file.
  * An ETag that ends in `-<number_of_parts>` therefore indicates a multipart upload, which explains a mismatch between the local MD5 and the ETag.

{% hint style="info" %}
These checksum comparisons (local MD5 vs ETag) are valid only for **single-part uploads**.\
For multipart uploads, a mismatch between ETag and local MD5 is expected.
{% endhint %}
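If you know the part size that the uploading client used, you can reproduce a multipart-style ETag locally. The sketch below assumes that Stook follows the common S3 convention (MD5 over the concatenated binary MD5 digests of each part, followed by the part count) and that every part except the last has the same size. `multipart_etag` is a hypothetical helper name, and the part size depends on the client and its configuration.

```python
import hashlib


def multipart_etag(path: str, part_size: int) -> str:
    """Compute an S3-style multipart ETag for a local file.

    Assumption: every part except the last has exactly `part_size` bytes,
    matching the part size used by the uploading client.
    """
    part_digests = []
    with open(path, "rb") as f:
        while True:
            part = f.read(part_size)  # holds one part in memory at a time
            if not part:
                break
            part_digests.append(hashlib.md5(part).digest())  # binary digest per part
    # MD5 over the concatenated binary digests, then "-<number_of_parts>"
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

If the result differs from the ETag that Stook reports, the most likely cause is a different part size, so treat a mismatch here as inconclusive rather than as proof of corruption.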

### Compute a local checksum in Python (optional)

Python examples are provided only to calculate **local checksums** on the client side.\
They **do not** calculate or change the checksum stored by Stook; they simply:

1. Download the object from Stook, and
2. Compute MD5 or SHA-256 locally so you can compare it with the ETag (for single-part uploads) or use it in your own integrity checks.

#### Example — Compute an MD5 checksum in Python

```python
import boto3
import hashlib

s3 = boto3.client(
    "s3",
    endpoint_url="https://endpoint_url",
    aws_access_key_id="access_key",
    aws_secret_access_key="secret_key",
)

response = s3.get_object(
    Bucket="bucket_name",
    Key="file_path",
)

md5 = hashlib.md5()

# Streaming read → do not load entire file into RAM
for chunk in response["Body"].iter_chunks(chunk_size=8192):
    if chunk:
        md5.update(chunk)

print("MD5:", md5.hexdigest())
```

#### Example — Compute a SHA-256 checksum in Python

```python
import boto3
import hashlib

s3 = boto3.client(
    "s3",
    endpoint_url="https://endpoint_url",
    aws_access_key_id="access_key",
    aws_secret_access_key="secret_key",
)

response = s3.get_object(
    Bucket="bucket_name",
    Key="file_path",
)

sha256 = hashlib.sha256()

# Streaming read → do not load entire file into RAM
for chunk in response["Body"].iter_chunks(chunk_size=8192):
    if chunk:
        sha256.update(chunk)

print("SHA-256:", sha256.hexdigest())
```

{% hint style="info" %}
These Python scripts compute **only local checksums**.\
They do not compute or retrieve any internal checksum other than what Stook already exposes via the `ETag` header.
{% endhint %}
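To compare a file on disk with an ETag you have already retrieved, a small standard-library helper is enough. This is a minimal sketch: `file_md5` and `matches_etag` are hypothetical names, and the comparison only makes sense for single-part uploads, so the helper refuses multipart-style ETags instead of reporting a misleading mismatch.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 8192) -> str:
    """Return the hex MD5 of a local file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def matches_etag(path: str, etag: str) -> bool:
    """Compare a local file's MD5 with a single-part ETag.

    Raises ValueError for multipart-style ETags ("<hex>-<N>"), because
    those are not the MD5 of the whole file.
    """
    etag = etag.strip().strip('"').lower()  # drop the quotes around the ETag value
    if "-" in etag:
        raise ValueError("multipart ETag: not comparable with a whole-file MD5")
    return file_md5(path) == etag
```

For example, an empty file matches the ETag `d41d8cd98f00b204e9800998ecf8427e`, which is the MD5 of zero bytes.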

### Troubleshoot checksum mismatches

If you notice a mismatch between a local checksum and the ETag from Stook:

* The object may have been uploaded as **multipart**, so the ETag will not match the MD5 of the full file.
* You may be using a different algorithm (for example, SHA-256 locally vs MD5-based ETag).
* You may be comparing against an outdated or changed local file.

Always verify:

* Upload method (single-part vs multipart)
* The algorithm used
* That you are comparing the correct object version
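The checks above can be sketched as a small triage function. `diagnose_mismatch` is a hypothetical helper, not part of any Stook tooling. It relies only on the shape of the two values: a `-<N>` suffix suggests a multipart upload, and a digest that is not 32 hexadecimal characters long cannot be MD5 (a SHA-256 hex digest, for instance, has 64 characters).

```python
import re


def diagnose_mismatch(etag: str, local_digest: str) -> str:
    """Suggest the most likely reason an ETag and a local hex digest differ."""
    etag = etag.strip().strip('"').lower()
    local_digest = local_digest.strip().lower()

    # Multipart-style ETag: 32 hex characters, a dash, then the part count
    if re.fullmatch(r"[0-9a-f]{32}-\d+", etag):
        parts = etag.split("-")[1]
        return f"multipart upload ({parts} parts): ETag is not the MD5 of the full file"

    # An MD5 hex digest always has 32 characters
    if len(local_digest) != 32:
        return "algorithm mismatch: the local digest is not MD5"

    if etag == local_digest:
        return "match"

    return "content mismatch: check the local file and the object version"
```

The function only narrows down the cause. Confirming the upload method or the object version still requires checking how and when the object was uploaded.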

### Summary

* Stook calculates a checksum when an object is uploaded and exposes it via the **ETag** header.
* You can read ETag using **AWS CLI**, **MinIO Client**, or **curl**.
* ETag typically matches the MD5 checksum of the object only for **single-part uploads**.
* For **multipart uploads**, ETag follows a special format and does not match the MD5 of the full file.
* Python examples compute **local MD5 or SHA-256** and are used only for client-side integrity validation, not for calculating Stook’s internal checksum.

