
Beyond Joule: Where SAP’s Generative AI Ends — and Predictive Intelligence Begins

Updated: Oct 26


When SAP Joule came out, it felt like the beginning of a new conversation between people and systems. For the first time, I could literally ask my SAP system things like:

“What’s our DSO this quarter?”
“Which suppliers are behind on payments?”

and Joule would respond with live insights — right from SAP’s own data.

That’s powerful. But after working with it, I realized something: Joule tells you what happened. Finance teams also need to know what’s about to happen.

That’s where Predictive Invoice Intelligence (PII) comes in.


The Idea

I built PII to complement Joule — not compete with it. It adds a thin layer of machine learning that uses your current FI data (not archives, not data lakes) to predict which invoices might pay late and how that affects your cash flow.

It’s designed to be simple, transparent, and 100% SAP-compatible. In short — it gives Joule a predictive brain.


How It Works

The concept is straightforward:

  1. ABAP extracts live invoice data from BKPF, BSEG, and BSAD.

  2. Python (Flask) cleans, calculates, and predicts payment delay probabilities.

  3. ABAP writes the results back into a Z-table in SAP.

  4. Streamlit visualizes everything — or Joule can surface it conversationally.

It’s real-time foresight, right where your data already lives.
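Before wiring anything into SAP, steps 2 and 3 can be sketched in plain pandas. This is a minimal, self-contained version of the feature logic the pipeline uses, run on made-up invoice rows (all vendor IDs, dates, and amounts here are illustrative):

```python
import pandas as pd

# Hypothetical invoice rows, shaped like the ABAP extract.
rows = [
    {"vendor": "V001", "amount": 1200.0, "docdate": "2024-01-10",
     "cleardate": "2024-02-25", "payterm": "NT30"},
    {"vendor": "V001", "amount": 800.0, "docdate": "2024-03-01",
     "cleardate": "2024-03-20", "payterm": "NT30"},
    {"vendor": "V002", "amount": 5000.0, "docdate": "2024-02-05",
     "cleardate": "2024-02-15", "payterm": "NT30"},
]
df = pd.DataFrame(rows)
df["docdate"] = pd.to_datetime(df["docdate"])
df["cleardate"] = pd.to_datetime(df["cleardate"])

# Days between posting and clearing, compared against the payment term.
df["payterm_days"] = df["payterm"].str.extract(r"(\d+)")[0].astype(int)
df["days_to_clear"] = (df["cleardate"] - df["docdate"]).dt.days
df["overdue_flag"] = (df["days_to_clear"] > df["payterm_days"]).astype(int)

# Vendor risk = share of that vendor's invoices that cleared late.
df["vendor_risk"] = df.groupby("vendor")["overdue_flag"].transform("mean")

print(df[["vendor", "days_to_clear", "overdue_flag", "vendor_risk"]])
```

The first V001 invoice cleared in 46 days against 30-day terms, so it gets flagged overdue and pushes that vendor’s risk score to 0.5 — exactly the signal the model trains on.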


Architecture Overview

SAP S/4HANA or ECC
    ↓
ABAP Extractor (ZPII_RUN_PIPELINE)
    ↓  JSON via HTTP
pii_api.py  → Flask service (features + prediction)
    ↓
ZINV_RISK_ANALYTICS (Z-table)
    ↓
Streamlit Dashboard (or Joule)

The best part? It’s modular — you can deploy it on-premise, in Azure, or right alongside your BTP services.


Step 1: The Python Intelligence Layer

Here’s the core of it — a small Flask API that does both feature engineering and prediction. Save this as pii_api.py and run it on your SAP-connected VM.

from flask import Flask, request, jsonify
import pandas as pd
import joblib, os
from sklearn.ensemble import GradientBoostingClassifier

app = Flask(__name__)
MODEL_PATH = "pii_model.pkl"
last_results = []

def ensure_model():
    # No saved model yet? Train a tiny bootstrap model on synthetic invoices.
    if os.path.exists(MODEL_PATH):
        return
    df = pd.DataFrame({
        "amount":[1000,2000,3000,1500,8000,500],
        "invoice_age":[10,30,60,5,90,2],
        "days_to_clear":[12,35,70,7,120,3],
        "vendor_risk":[0.1,0.2,0.8,0.05,0.9,0.02],
        "overdue_flag":[0,0,1,0,1,0],
    })
    X = df[["amount","invoice_age","days_to_clear","vendor_risk"]]
    y = df["overdue_flag"]
    model = GradientBoostingClassifier().fit(X, y)
    joblib.dump(model, MODEL_PATH)

ensure_model()
model = joblib.load(MODEL_PATH)

@app.route("/api/features", methods=["POST"])
def build_features():
    # Raw invoice rows arrive as JSON from the ABAP extractor.
    df = pd.DataFrame(request.get_json())
    df["docdate"] = pd.to_datetime(df["docdate"], errors="coerce")
    df["cleardate"] = pd.to_datetime(df["cleardate"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0.0)
    # Pull the digits out of the terms-of-payment key; default to 30 days.
    df["payterm_days"] = pd.to_numeric(df["payterm"].astype(str).str.extract(r"(\d+)")[0], errors="coerce").fillna(30)
    today = pd.Timestamp.today().normalize()
    df["invoice_age"] = (today - df["docdate"]).dt.days.fillna(0)
    df["days_to_clear"] = (df["cleardate"] - df["docdate"]).dt.days.fillna(0)
    df["overdue_flag"] = (df["days_to_clear"] > df["payterm_days"]).astype(int)
    # Vendor risk = share of each vendor's invoices that cleared late.
    df["vendor_risk"] = df.groupby("vendor")["overdue_flag"].transform("mean").fillna(0)
    feats = df[["company","vendor","amount","invoice_age","days_to_clear","vendor_risk","overdue_flag"]]
    return jsonify(feats.to_dict(orient="records"))

@app.route("/api/predict", methods=["POST"])
def predict():
    global last_results, model
    feats = pd.DataFrame(request.get_json())
    X = feats[["amount","invoice_age","days_to_clear","vendor_risk"]]
    feats["delay_probability"] = model.predict_proba(X)[:,1]
    last_results = feats.to_dict(orient="records")
    return jsonify(last_results)

@app.route("/api/results", methods=["GET"])
def results():
    return jsonify(last_results)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)
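One detail in /api/features worth a closer look is the payment-term parsing. SAP terms-of-payment keys (ZTERM) are free-form strings like 0014 or NT30, so the API uses a rough heuristic: pull out the first run of digits and fall back to 30 days when there is none. A quick standalone check of that behavior (the sample values below are made up):

```python
import pandas as pd

# Assorted ZTERM-style values, including ones with no digits at all.
payterms = pd.Series(["NT30", "0014", "Z060", "CASH", None])

# Same extraction the API uses: first digit run, else default to 30.
days = pd.to_numeric(
    payterms.astype(str).str.extract(r"(\d+)")[0], errors="coerce"
).fillna(30)

print(days.tolist())  # [30.0, 14.0, 60.0, 30.0, 30.0]
```

It’s a heuristic, not a lookup against T052 — good enough for a prototype, but worth replacing with real net-due-date logic in production.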

Step 2: ABAP — Extract, Predict, and Write Back

Now let’s look at the SAP side. This ABAP program connects directly to the Python API and pushes live FI data through it.

ZPII_RUN_PIPELINE

REPORT zpii_run_pipeline.

CONSTANTS: gc_url TYPE string VALUE 'http://localhost:5001'.

DATA: lt_invoices TYPE TABLE OF zpii_invoice,
      lv_json TYPE string,
      lv_features TYPE string,
      lv_predictions TYPE string,
      lt_predictions TYPE TABLE OF zpii_pred.

SELECT a~bukrs AS company,
       a~lifnr AS vendor,
       b~wrbtr AS amount,
       a~budat AS docdate,
       a~zterm AS payterm,
       c~augdt AS cleardate
  FROM bkpf AS a
  INNER JOIN bseg AS b
    ON a~bukrs = b~bukrs AND a~belnr = b~belnr AND a~gjahr = b~gjahr
  LEFT JOIN bsad AS c
    ON a~bukrs = c~bukrs AND a~belnr = c~belnr AND a~gjahr = c~gjahr
  WHERE b~koart = 'K'  " vendor line items only
  INTO CORRESPONDING FIELDS OF TABLE @lt_invoices
  UP TO 1000 ROWS.

IF lt_invoices IS INITIAL.
  WRITE: / 'No invoices found.'.
  LEAVE PROGRAM.
ENDIF.

" Serialize with lowercase keys so they match what pii_api.py expects
lv_json = /ui2/cl_json=>serialize( data        = lt_invoices
                                   pretty_name = /ui2/cl_json=>pretty_mode-low_case ).

" Send to feature builder
cl_http_client=>create_by_url( EXPORTING url    = gc_url && '/api/features'
                               IMPORTING client = DATA(lo_http1) ).
lo_http1->request->set_method( if_http_request=>co_request_method_post ).
lo_http1->request->set_header_field( name = 'Content-Type' value = 'application/json' ).
lo_http1->request->set_cdata( lv_json ).
lo_http1->send( ).
lo_http1->receive( ).
lv_features = lo_http1->response->get_cdata( ).

" Send to predictor
cl_http_client=>create_by_url( EXPORTING url    = gc_url && '/api/predict'
                               IMPORTING client = DATA(lo_http2) ).
lo_http2->request->set_method( if_http_request=>co_request_method_post ).
lo_http2->request->set_header_field( name = 'Content-Type' value = 'application/json' ).
lo_http2->request->set_cdata( lv_features ).
lo_http2->send( ).
lo_http2->receive( ).
lv_predictions = lo_http2->response->get_cdata( ).

" Write to Z-table
/ui2/cl_json=>deserialize( EXPORTING json        = lv_predictions
                                     pretty_name = /ui2/cl_json=>pretty_mode-low_case
                           CHANGING  data        = lt_predictions ).
MODIFY zinv_risk_analytics FROM TABLE lt_predictions.
COMMIT WORK.

WRITE: / 'Pipeline complete:', lines( lt_predictions ), 'predictions stored.'.

The beauty of this program is its simplicity: no OData, no middleware — just direct JSON exchange between SAP and Python.
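One thing to watch at this boundary is key casing: /ui2/cl_json emits uppercase component names unless you pass pretty_name = /ui2/cl_json=>pretty_mode-low_case, while pii_api.py reads lowercase keys. A small Python sketch of the payload shape the feature endpoint expects (all values here are illustrative):

```python
import json

# One invoice row as the ABAP extractor should serialize it.
payload = [
    {"company": "1000", "vendor": "100042", "amount": 2500.00,
     "docdate": "2024-05-02", "payterm": "NT30", "cleardate": None},
]

# Round-trip through JSON, as it travels over HTTP.
body = json.dumps(payload)
parsed = json.loads(body)

# Every key the feature builder touches must be present, in lowercase.
expected = {"company", "vendor", "amount", "docdate", "payterm", "cleardate"}
print(expected.issubset(parsed[0].keys()))  # True
```

Open invoices simply carry a null cleardate — the Python side coerces that to NaT and treats days_to_clear as 0.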

Z-table for Predictions

Create this in SE11 → ZINV_RISK_ANALYTICS

Field               Type      Description
COMPANY             BUKRS     Company code
VENDOR              LIFNR     Vendor
AMOUNT              WRBTR     Invoice amount
INVOICE_AGE         INT4      Age in days
DAYS_TO_CLEAR       INT4      Days to payment
VENDOR_RISK         DEC16_6   Vendor risk factor
OVERDUE_FLAG        INT1      Binary overdue indicator
DELAY_PROBABILITY   DEC16_6   Model output probability

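If you want to sanity-check the write-back contract outside SAP, each prediction record from /api/predict maps one-to-one onto the table columns above — the lowercase JSON keys become uppercase field names. A quick sketch, using a hypothetical record:

```python
# One prediction record as /api/predict returns it (values illustrative).
record = {
    "company": "1000", "vendor": "100042", "amount": 2500.0,
    "invoice_age": 41, "days_to_clear": 0, "vendor_risk": 0.33,
    "overdue_flag": 0, "delay_probability": 0.72,
}

# Uppercased keys line up with the ZINV_RISK_ANALYTICS columns.
z_row = {k.upper(): v for k, v in record.items()}
print(sorted(z_row))
```

If the two ever drift apart, /ui2/cl_json=>deserialize silently leaves unmatched fields initial — so a mismatch shows up as empty columns, not as an error.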
Step 3: The Streamlit Dashboard

You can visualize the predictions directly from the API using Streamlit — a lightweight Python web-app framework. Start it with streamlit run streamlit_app.py.

streamlit_app.py

import streamlit as st
import pandas as pd
import requests

API = "http://localhost:5001"

st.title("Predictive Invoice Intelligence — Live SAP Data")

if st.button("Check API Health"):
    try:
        r = requests.get(f"{API}/api/results", timeout=5)
        st.success("API reachable.")
    except Exception as e:
        st.error(str(e))

if st.button("Load Latest Predictions"):
    try:
        r = requests.get(f"{API}/api/results", timeout=10)
        if r.ok and r.json():
            df = pd.DataFrame(r.json())
            st.dataframe(df)
            if {"vendor","delay_probability"}.issubset(df.columns):
                st.bar_chart(df[["vendor","delay_probability"]].set_index("vendor"))
        else:
            st.info("No results yet. Run ZPII_RUN_PIPELINE in SAP first.")
    except Exception as e:
        st.error(str(e))
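One refinement worth making before charting: a vendor usually appears on several invoices, and set_index("vendor") with duplicate vendors plots each row as its own bar. Aggregating first gives one bar per vendor — for example (sample data made up):

```python
import pandas as pd

# Sample prediction rows; in the dashboard these come from /api/results.
df = pd.DataFrame({
    "vendor": ["V001", "V001", "V002"],
    "delay_probability": [0.8, 0.4, 0.1],
})

# Mean delay probability per vendor: one value, one bar.
per_vendor = df.groupby("vendor")["delay_probability"].mean()
print(per_vendor.round(2).to_dict())  # {'V001': 0.6, 'V002': 0.1}
```

In the dashboard you would pass per_vendor straight to st.bar_chart instead of the raw frame.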

Step   What Happens
1      SAP extracts live invoice and payment data
2      Python cleans data and computes features
3      ML model predicts delay probabilities
4      Results are written back to SAP
5      Streamlit (or Joule) visualizes predictions


What It Gives You

  • Proactive cash-flow insight — spot high-risk vendors early.

  • Smarter conversations with Joule — “Which invoices are likely to be late this month?”

  • No archive dependency — uses only current FI tables.

  • End-to-end transparency — you can see exactly how predictions are made.


Final Thought

SAP Joule gave us conversation. PII gives that conversation intuition.

It’s not about replacing SAP — it’s about extending it with the kind of predictive intelligence that finance has always needed. Built on real ABAP. Powered by Python. Designed for people who don’t just want to ask questions — but want to see the future inside their SAP system.

 
 
 
