.. RedShred API Client documentation master file, created by
   sphinx-quickstart on Fri Jun 28 10:18:46 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to RedShred API Client's documentation!
===============================================


Basic Usage
-----------

Authenticating the client
^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are a few options for authentication: explicitly specifying the server information,
using a RedShred configuration file, or using environment variables. For the time being
we will only discuss the first option, but you can refer to the API documentation for more information
on the others.

Here is how you would authenticate with explicit credentials:

.. code-block:: python
   :caption: Authentication with RedShred using explicit credentials
   :linenos:

   from redshred import RedShredClient
   client = RedShredClient(token="ae6daceef240103b2b9e9b562ff6784690cdd0fb", host="https://api.redshred.com")
   print(client.user.json(indent=2))

.. code-block:: json
   :caption: output

   {
     "active": true,
     "email": "johndoe@theinternet.com",
     "first_name": "John",
     "joined": "2022-11-15T15:40:56.928582+00:00",
     "last_login": "2023-01-24T15:59:24.019915+00:00",
     "last_name": "Doe",
     "username": "johndoe@theinternet.com"
   }

Congratulations, you are now authenticated!
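
If you would rather not hard-code the token in your source, one possible approach is to read it from
the environment yourself and still pass it explicitly. The sketch below is only an illustration: the
variable names ``MYAPP_REDSHRED_TOKEN`` and ``MYAPP_REDSHRED_HOST`` and the helper ``load_credentials``
are hypothetical and are not names that the client itself looks for.

.. code-block:: python
   :caption: a sketch of loading explicit credentials from hypothetical environment variables

   import os


   def load_credentials(environ=None):
       """Collect the ``token`` and ``host`` keyword arguments from environment variables.

       MYAPP_REDSHRED_TOKEN and MYAPP_REDSHRED_HOST are made-up names for this sketch.
       """
       environ = os.environ if environ is None else environ
       token = environ.get("MYAPP_REDSHRED_TOKEN")
       if not token:
           raise RuntimeError("MYAPP_REDSHRED_TOKEN is not set")
       # Fall back to the host used in the example above when none is given
       host = environ.get("MYAPP_REDSHRED_HOST", "https://api.redshred.com")
       return {"token": token, "host": host}


   # Usage (requires the redshred package and a valid token):
   # from redshred import RedShredClient
   # client = RedShredClient(**load_credentials())

The helper only builds the same ``token`` and ``host`` arguments shown above, so the call to
``RedShredClient`` itself is unchanged.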


Creating Collections and Documents
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In order to process and upload documents, we need a collection to store them in.

.. code-block:: python
   :caption: creating a collection called "my-collection"
   :linenos:

   from redshred import Collection

   # Create the collection object locally
   collection = Collection(slug="my-collection")

   # Associate it with our client and create it remotely
   collection.create(client)

   print(collection.yaml())

.. code-block:: yaml
   :caption: output

   config: {}
   created_at: '2023-01-24T19:44:27.874546'
   created_by: johndoe@theinternet.com
   description: null
   documents_link: https://api.dev.redshred.com/v2/collections/my-collection/documents
   id: ETiN7a7EAsNYb2CwX4zkvY
   marked_for_delete: null
   metadata: null
   name: null
   owner: johndoe@theinternet.com
   perspectives_link: https://api.dev.redshred.com/v2/collections/my-collection/perspectives
   segments_link: https://api.dev.redshred.com/v2/collections/my-collection/segments
   self_link: https://api.dev.redshred.com/v2/collections/my-collection
   slug: my-collection
   updated_at: '2023-01-24T19:44:27.874563'
   updated_by: johndoe@theinternet.com
   user_data: {}

Once we have created a collection we will want to upload a document:

.. code-block:: python
   :caption: uploading a document from a local file "my-document.pdf"
   :linenos:

   # First, retrieve the collection we created above by its slug
   collection = client.collection("my-collection")
   document = collection.upload_file("/home/johndoe/Documents/my-document.pdf")

   print(document.json(indent=2))

.. code-block:: json
   :caption: output

   {
     "self_link": "https://api.dev.redshred.com/v2/collections/my-collection/documents/aE9J62ZR7RAhVYaCeSwftJ",
     "id": "aE9J62ZR7RAhVYaCeSwftJ",
     "collection_link": "https://api.dev.redshred.com/v2/collections/my-collection",
     "collection_slug": "my-collection",
     "config": null,
     "content_hash": "22132ba64ec6bf79eabbba1b57ce9c8d8663bbc0e1252f17032f79b693d2edfa",
     "created_at": "2023-01-24T19:50:26.839768+00:00",
     "created_by": "johndoe@theinternet.com",
     "csv_metadata": null,
     "description": null,
     "document_segment_link": null,
     "errors": null,
     "file_link": "https://api.dev.redshred.com/v2/files/my-collection/s22132ba64ec6bf79eabbba1b57ce9c8d8663bbc0e1252f17032f79b693d2edfa.pdf?name=my-document.pdf",
     "file_size": 21367,
     "index": 1,
     "metadata": null,
     "n_pages": null,
     "name": "my-document.pdf",
     "original_name": "my-document.pdf",
     "pages_link": "https://api.dev.redshred.com/v2/collections/my-collection/documents/aE9J62ZR7RAhVYaCeSwftJ/pages",
     "pdf_link": "https://api.dev.redshred.com/v2/files/my-collection/s22132ba64ec6bf79eabbba1b57ce9c8d8663bbc0e1252f17032f79b693d2edfa.pdf",
     "perspectives_link": "https://api.dev.redshred.com/v2/collections/my-collection/documents/aE9J62ZR7RAhVYaCeSwftJ/perspectives",
     "read_state": "queued",
     "read_state_updated_at": "2023-01-24T19:50:26.961375+00:00",
     "region": {
       "coordinates": [
         [
           [0.0, 0.0],
           [1.0, 0.0],
           [1.0, 1.0],
           [0.0, 1.0],
           [0.0, 0.0]
         ]
       ],
       "type": "Polygon"
     },
     "segments_link": "https://api.dev.redshred.com/v2/collections/my-collection/documents/aE9J62ZR7RAhVYaCeSwftJ/segments",
     "slug": "my-documentpdf",
     "source": "file",
     "summary": null,
     "text": null,
     "updated_at": "2023-01-24T19:50:26.894566+00:00",
     "updated_by": "johndoe@theinternet.com",
     "user_data": null,
     "warnings": null
   }

Notice that most of the information we would expect to see, such as ``text`` or ``n_pages``, is empty. This
is because the document is still being read! We can wait for reading to finish and then see what has changed
like so:

.. code-block:: python
   :caption: waiting until the document is read
   :linenos:

   document.wait_until_read()  # execution pauses here until the document has finished reading remotely
   print(document.yaml(include={"text", "n_pages"}))

.. code-block:: yaml
   :caption: the output with only the specified fields displayed

   n_pages: 2
   text: The rain in Spain stays mainly in the plain...
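
Conceptually, a blocking wait like this is a polling loop. If you ever need the same pattern with an
explicit upper bound on the waiting time, a generic version can be sketched as below. This is not the
client's implementation: ``wait_until`` is a hypothetical helper, and the ``"finished"`` value is a
placeholder for whatever ``read_state`` the server reports once it is no longer ``"queued"``.

.. code-block:: python
   :caption: a generic polling loop with a timeout (illustrative sketch)

   import time


   def wait_until(predicate, timeout=60.0, interval=1.0):
       """Call predicate() repeatedly until it returns True or the timeout expires."""
       deadline = time.monotonic() + timeout
       while True:
           if predicate():
               return True
           if time.monotonic() >= deadline:
               return False
           time.sleep(interval)


   # Stand-in for a document whose state changes on the server; "finished" is a placeholder value
   states = iter(["queued", "queued", "finished"])
   finished = wait_until(lambda: next(states) != "queued", timeout=5.0, interval=0.01)
   print(finished)

In real use the predicate would re-fetch the document and inspect its ``read_state``.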


Accessing and Interacting with API Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python
   :caption: Finding a collection and printing all paragraphs in a document
   :linenos:

   for collection in client.collections():
      print(f"{collection.slug!r} created at {collection.created_at} by {collection.created_by}")
      if collection.slug == "my-collection":
         print("my-collection found!")
         break

   # Next we find our document. Since we know only one document is named "my-document.pdf", we can
   # take the first search result from querying the server via its attributes
   document = collection.documents(name="my-document.pdf").first()

   # We can then do a similar operation to find our text perspective, called "typography", and all of its
   # paragraph segments
   typography_perspective = document.perspectives(name="typography").first()
   for segment in typography_perspective.segments(segment_type="paragraph"):
      print(f"paragraph id: {segment.id}")
      print(segment.text)


.. code-block:: text
   :caption: output

   'my-other-collection' created at 2023-01-24 19:44:27.874546+00:00 by johndoe@theinternet.com
   'my-collection' created at 2023-01-11 17:37:03.342783+00:00 by johndoe@theinternet.com
   my-collection found!

   paragraph id: WmXRmpJVseMgdGL5TbM8b6
   The rain in Spain stays mainly in the plain


RSQL Call Query Creation
^^^^^^^^^^^^^^^^^^^^^^^^

The RedShred API supports RSQL (RedShred Query Language) for advanced querying. The client library provides
a safe way to construct these queries using string templates and automatic escaping of special characters.

Basic Query
"""""""""""

For simple queries, you can pass a string directly:

.. code-block:: python
   :caption: Passing an RSQL query as a plain string
   :linenos:

    from redshred import RedShredClient

    rs = RedShredClient()
    collections = rs.collections(q='text="rose" and segment_type="paragraph"')


Template with Values
""""""""""""""""""""

For more complex queries, use a list where the first element is a template string and subsequent
elements are values:

.. code-block:: python
   :caption: Building queries from a template and values
   :linenos:

    from redshred import RedShredClient

    rs = RedShredClient()

    # Using positional arguments
    collections = rs.collections(q=['name == "{}"', "John Smith"])
    collections = rs.collections(q=['created_at >= {} and created_at <= {}', "2024-01-01", "2024-12-31"])

    # Using keyword arguments allows you to reorder or reuse values multiple times.
    collections = rs.collections(q=["age >= {min} and age <= {max}", {"min": 25, "max": 35}])
    collections = rs.collections(q=['status = "{status}"', {"status": "active"}])

    # Mixed positional and keyword arguments
    collections = rs.collections(q=['text = "{}" and age >= {min}', "John Smith", {"min": 25}])

Special Character Handling
""""""""""""""""""""""""""

The client automatically escapes special characters in string values (``\``, ``'``, ``"``, ``\a``, ``\b``,
``\f``, ``\n``, ``\r``, ``\t``, ``\v``) when they are passed as template values. For example:

.. code-block:: python
   :caption: Special characters escaped by the client
   :linenos:

    from redshred import RedShredClient

    # Initialize the client
    rs = RedShredClient()

    # Line breaks in strings are properly escaped
    collections = rs.collections(q=['text = "{}"', "first line\nsecond line"])

    # Quote marks are escaped
    collections = rs.collections(q=['document_name = "{}"', "O'Brien"])
    collections = rs.collections(q=['description = "{}"', 'Contains "quoted" text'])

    # Backslash handling
    # Using template strings - the client handles escaping automatically
    collections = rs.collections(q=['text = "{name}"', {"name": "back\\slash"}])
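
To make the escaping behaviour more concrete, here is a minimal standalone sketch of how a
``[template, values...]`` list could be expanded into a single query string. It is an illustration only
and not the client's actual implementation: ``escape_value`` and ``render_query`` are hypothetical names,
and the use of backslash escapes for the characters listed above is an assumption.

.. code-block:: python
   :caption: an illustrative sketch of template expansion with escaping

   # Characters escaped in this sketch, mirroring the list above
   _ESCAPES = {
       "\\": "\\\\",
       "'": "\\'",
       '"': '\\"',
       "\a": "\\a",
       "\b": "\\b",
       "\f": "\\f",
       "\n": "\\n",
       "\r": "\\r",
       "\t": "\\t",
       "\v": "\\v",
   }


   def escape_value(value):
       """Escape special characters in strings; leave other types untouched."""
       if not isinstance(value, str):
           return value
       # Working character by character avoids escaping a backslash twice
       return "".join(_ESCAPES.get(ch, ch) for ch in value)


   def render_query(q):
       """Expand a plain string or a [template, *values] list into a query string."""
       if isinstance(q, str):
           return q
       template, *values = q
       positional, keyword = [], {}
       for value in values:
           if isinstance(value, dict):
               # A dict supplies the named placeholders such as {min}
               keyword.update({k: escape_value(v) for k, v in value.items()})
           else:
               positional.append(escape_value(value))
       return template.format(*positional, **keyword)


   print(render_query(['text = "{}"', "first line\nsecond line"]))
   print(render_query(['text = "{}" and age >= {min}', "John Smith", {"min": 25}]))

The first call prints the line break as the two characters ``\n`` inside the quoted value, which is why a
raw string concatenation is best avoided in favour of the template form.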

Common Use Cases
""""""""""""""""

Here are some practical examples of RSQL queries:

.. code-block:: python
   :caption: Practical RSQL query examples
   :linenos:

    from redshred import RedShredClient

    # Initialize the client
    rs = RedShredClient()

    # Date range filtering
    collections = rs.collections(
        q=['created_at >= "{}" and created_at <= "{}"', "2024-01-01", "2024-12-31"]
    )

    # Status and type filtering
    collections = rs.collections(
        q=['status="{}" and segment_type="{}"', "active", "paragraph"]
    )

    # Text search with multiple conditions
    collections = rs.collections(
        q=['text="{}" and (segment_type="{}" or segment_type="{}")', "Report", "pdf", "doc"]
    )

    # Numeric range with custom field
    collections = rs.collections(
        q=["page_count >= {min} and confidence > {threshold}",
           {"min": 10, "threshold": 0.85}]
    )
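
When the same kind of filter is needed in several places, it can help to build the ``q`` list in a small
helper. The sketch below assembles the date range query shown above from ``datetime.date`` objects;
``date_range_query`` is a hypothetical helper written for this example and is not part of the client.

.. code-block:: python
   :caption: a hypothetical helper that builds a date range query list

   from datetime import date


   def date_range_query(field, start, end):
       """Return a [template, value, value] list for an inclusive date range."""
       template = field + ' >= "{}" and ' + field + ' <= "{}"'
       return [template, start.isoformat(), end.isoformat()]


   q = date_range_query("created_at", date(2024, 1, 1), date(2024, 12, 31))
   print(q)

   # Usage (requires the redshred package and valid credentials):
   # collections = rs.collections(q=q)

The dates stay separate from the template, so the client's own escaping still applies to them.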

API Documentation
-----------------

.. toctree::
   :maxdepth: 8
   :caption: Contents:

   redshred


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`