Beyond Joule: Where SAP’s Generative AI Ends — and Predictive Intelligence Begins
- Walf Sun
- Oct 25
- 4 min read
Updated: Oct 26

When SAP Joule came out, it felt like the beginning of a new conversation between people and systems. For the first time, I could literally ask my SAP system things like:
“What’s our DSO this quarter?” “Which suppliers are behind on payments?”
and Joule would respond with live insights — right from SAP’s own data.
That’s powerful. But after working with it, I realized something: Joule tells you what happened. Finance teams also need to know what’s about to happen.
That’s where Predictive Invoice Intelligence (PII) comes in.
The Idea
I built PII to complement Joule — not compete with it. It adds a thin layer of machine learning that uses your current FI data (not archives, not data lakes) to predict which invoices might pay late and how that affects your cash flow.
It’s designed to be simple, transparent, and 100% SAP-compatible. In short — it gives Joule a predictive brain.
How It Works
The concept is straightforward:
ABAP extracts live invoice data from BKPF, BSEG, and BSAK (cleared vendor items).
Python (Flask) cleans, calculates, and predicts payment delay probabilities.
ABAP writes the results back into a Z-table in SAP.
Streamlit visualizes everything — or Joule can surface it conversationally.
It’s real-time foresight, right where your data already lives.
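To make the data contract concrete, here is a minimal sketch of the records exchanged at each hop. The field names match the ABAP extractor and Flask API shown below; the values are invented for illustration:

# Illustrative only: invented sample values, field names as used by the
# extractor (ZPII_RUN_PIPELINE) and the Flask API (pii_api.py) below.

# 1. What ABAP posts to /api/features (one dict per invoice line):
raw_invoice = {
    "company": "1000",        # BKPF-BUKRS
    "vendor": "100042",       # BSEG-LIFNR
    "amount": 2450.00,        # BSEG-WRBTR
    "docdate": "2025-09-12",  # BKPF-BUDAT
    "payterm": "Z030",        # BSEG-ZTERM (the API parses out the day count)
    "cleardate": None,        # BSAK-AUGDT, empty while the item is open
}

# 2. What /api/features returns and /api/predict consumes:
feature_row = {
    "company": "1000", "vendor": "100042", "amount": 2450.00,
    "invoice_age": 43, "days_to_clear": 0,
    "vendor_risk": 0.35, "overdue_flag": 0,
}

# 3. What /api/predict adds before the result is written back to SAP:
prediction_row = {**feature_row, "delay_probability": 0.72}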
Architecture Overview
SAP S/4HANA or ECC
↓
ABAP Extractor (ZPII_RUN_PIPELINE)
↓ JSON via HTTP
pii_api.py → Flask service (features + prediction)
↓
ZINV_RISK_ANALYTICS (Z-table)
↓
Streamlit Dashboard (or Joule)
The best part? It’s modular — you can deploy it on-premise, in Azure, or right alongside your BTP services.
Step 1: The Python Intelligence Layer
Here’s the core of it — a small Flask API that does both feature engineering and prediction. Save this as pii_api.py and run it on your SAP-connected VM.
from flask import Flask, request, jsonify
import pandas as pd
import joblib, os
from sklearn.ensemble import GradientBoostingClassifier

app = Flask(__name__)
MODEL_PATH = "pii_model.pkl"
last_results = []

def ensure_model():
    """Train and persist a tiny bootstrap model if none exists yet."""
    if os.path.exists(MODEL_PATH):
        return
    # Minimal seed data so the service can start cold; retrain on your
    # real payment history as soon as you have it.
    df = pd.DataFrame({
        "amount": [1000, 2000, 3000, 1500, 8000, 500],
        "invoice_age": [10, 30, 60, 5, 90, 2],
        "days_to_clear": [12, 35, 70, 7, 120, 3],
        "vendor_risk": [0.1, 0.2, 0.8, 0.05, 0.9, 0.02],
        "overdue_flag": [0, 0, 1, 0, 1, 0],
    })
    X = df[["amount", "invoice_age", "days_to_clear", "vendor_risk"]]
    y = df["overdue_flag"]
    model = GradientBoostingClassifier().fit(X, y)
    joblib.dump(model, MODEL_PATH)

ensure_model()
model = joblib.load(MODEL_PATH)

@app.route("/api/features", methods=["POST"])
def build_features():
    """Turn raw invoice records from ABAP into model-ready features."""
    df = pd.DataFrame(request.get_json())
    df["docdate"] = pd.to_datetime(df["docdate"], errors="coerce")
    df["cleardate"] = pd.to_datetime(df["cleardate"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0.0)
    # Payment terms arrive as text (e.g. "Z030"); keep only the day count.
    df["payterm_days"] = pd.to_numeric(
        df["payterm"].astype(str).str.extract(r"(\d+)")[0], errors="coerce"
    ).fillna(30)
    today = pd.Timestamp.today().normalize()
    df["invoice_age"] = (today - df["docdate"]).dt.days.fillna(0)
    # Open items have no clearing date yet, so they default to 0 here.
    df["days_to_clear"] = (df["cleardate"] - df["docdate"]).dt.days.fillna(0)
    df["overdue_flag"] = (df["days_to_clear"] > df["payterm_days"]).astype(int)
    # Vendor risk = that vendor's historical share of overdue invoices.
    df["vendor_risk"] = df.groupby("vendor")["overdue_flag"].transform("mean").fillna(0)
    feats = df[["company", "vendor", "amount", "invoice_age",
                "days_to_clear", "vendor_risk", "overdue_flag"]]
    return jsonify(feats.to_dict(orient="records"))

@app.route("/api/predict", methods=["POST"])
def predict():
    """Score feature records and cache the results for /api/results."""
    global last_results
    feats = pd.DataFrame(request.get_json())
    X = feats[["amount", "invoice_age", "days_to_clear", "vendor_risk"]]
    feats["delay_probability"] = model.predict_proba(X)[:, 1]
    last_results = feats.to_dict(orient="records")
    return jsonify(last_results)

@app.route("/api/results", methods=["GET"])
def results():
    """Return the most recent batch of predictions (used by the dashboard)."""
    return jsonify(last_results)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)
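Before wiring up ABAP, you can smoke-test the service from any machine that can reach it. Here is a minimal sketch using the requests library, with invented sample invoices shaped like the records the extractor in Step 2 will send:

import requests

API = "http://localhost:5001"  # adjust to wherever pii_api.py runs

# Two invented invoices in the shape the ABAP extractor produces.
sample = [
    {"company": "1000", "vendor": "100042", "amount": 2450.00,
     "docdate": "2025-09-12", "payterm": "Z030", "cleardate": None},
    {"company": "1000", "vendor": "100077", "amount": 815.50,
     "docdate": "2025-07-01", "payterm": "Z014", "cleardate": "2025-08-20"},
]

# The same two hops the ABAP report performs: features first, then scoring.
feats = requests.post(f"{API}/api/features", json=sample, timeout=10).json()
preds = requests.post(f"{API}/api/predict", json=feats, timeout=10).json()

for p in preds:
    print(p["vendor"], round(p["delay_probability"], 3))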
Step 2: ABAP — Extract, Predict, and Write Back
Now let’s look at the SAP side. This ABAP program connects directly to the Python API and pushes live FI data through it.
ZPII_RUN_PIPELINE
REPORT zpii_run_pipeline.

CONSTANTS: gc_url TYPE string VALUE 'http://localhost:5001'.

DATA: lt_invoices    TYPE TABLE OF zpii_invoice,
      lv_json        TYPE string,
      lv_features    TYPE string,
      lv_predictions TYPE string,
      lt_predictions TYPE TABLE OF zpii_pred.

" Pull vendor line items with header and clearing info (capped for the demo).
SELECT a~bukrs AS company,
       b~lifnr AS vendor,
       b~wrbtr AS amount,
       a~budat AS docdate,
       b~zterm AS payterm,
       c~augdt AS cleardate
  FROM bkpf AS a
  INNER JOIN bseg AS b
    ON a~bukrs = b~bukrs AND a~belnr = b~belnr AND a~gjahr = b~gjahr
  LEFT JOIN bsak AS c
    ON a~bukrs = c~bukrs AND a~belnr = c~belnr AND a~gjahr = c~gjahr
  WHERE b~koart = 'K'   " vendor line items only
  INTO CORRESPONDING FIELDS OF TABLE @lt_invoices
  UP TO 1000 ROWS.

IF lt_invoices IS INITIAL.
  WRITE: / 'No invoices found.'.
  LEAVE PROGRAM.
ENDIF.

" Lower-case field names so the JSON matches what the Python API expects.
lv_json = /ui2/cl_json=>serialize( data        = lt_invoices
                                   pretty_name = /ui2/cl_json=>pretty_mode-low_case ).

" Send to feature builder
cl_http_client=>create_by_url( EXPORTING url    = gc_url && '/api/features'
                               IMPORTING client = DATA(lo_http1) ).
lo_http1->request->set_method( if_http_request=>co_request_method_post ).
lo_http1->request->set_header_field( name = 'Content-Type' value = 'application/json' ).
lo_http1->request->set_cdata( lv_json ).
lo_http1->send( ).
lo_http1->receive( ).
lv_features = lo_http1->response->get_cdata( ).

" Send to predictor
cl_http_client=>create_by_url( EXPORTING url    = gc_url && '/api/predict'
                               IMPORTING client = DATA(lo_http2) ).
lo_http2->request->set_method( if_http_request=>co_request_method_post ).
lo_http2->request->set_header_field( name = 'Content-Type' value = 'application/json' ).
lo_http2->request->set_cdata( lv_features ).
lo_http2->send( ).
lo_http2->receive( ).
lv_predictions = lo_http2->response->get_cdata( ).

" Write predictions back to the Z-table
/ui2/cl_json=>deserialize( EXPORTING json = lv_predictions CHANGING data = lt_predictions ).
MODIFY zinv_risk_analytics FROM TABLE lt_predictions.
COMMIT WORK.

WRITE: / 'Pipeline complete:', lines( lt_predictions ), 'predictions stored.'.
The beauty of this program is its simplicity: no OData, no middleware — just direct JSON exchange between SAP and Python.
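In a real deployment you would likely schedule ZPII_RUN_PIPELINE as a periodic background job (SM36) so the Z-table always holds fresh predictions; the synchronous calls shown here keep the walkthrough easy to follow.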
Z-table for Predictions
Create this in SE11 → ZINV_RISK_ANALYTICS
Field | Type | Description |
COMPANY | BUKRS | Company code |
VENDOR | LIFNR | Vendor |
AMOUNT | WRBTR | Invoice amount |
INVOICE_AGE | INT4 | Age in days |
DAYS_TO_CLEAR | INT4 | Days to payment |
VENDOR_RISK | DEC16_6 | Vendor risk factor |
OVERDUE_FLAG | INT1 | Binary overdue indicator |
DELAY_PROBABILITY | DEC16_6 | Model output probability |
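On the Python side, you could mirror this layout to catch schema drift before results ever reach SAP. A small sketch with a hypothetical validate helper (not part of the pipeline above), assuming the field names from the table:

from typing import TypedDict

class RiskRecord(TypedDict):
    """Mirrors ZINV_RISK_ANALYTICS: one scored invoice per record."""
    company: str              # BUKRS
    vendor: str               # LIFNR
    amount: float             # WRBTR
    invoice_age: int          # INT4
    days_to_clear: int        # INT4
    vendor_risk: float        # DEC16_6
    overdue_flag: int         # INT1
    delay_probability: float  # DEC16_6

REQUIRED = set(RiskRecord.__annotations__)

def validate(records: list[dict]) -> None:
    # Fail fast if the prediction payload drifts from the table layout.
    for rec in records:
        missing = REQUIRED - rec.keys()
        if missing:
            raise ValueError(f"record missing fields: {sorted(missing)}")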
Step 3: The Streamlit Dashboard
You can visualize the predictions directly from the API using Streamlit — a lightweight web app framework.
streamlit_app.py
import streamlit as st
import pandas as pd
import requests

API = "http://localhost:5001"

st.title("Predictive Invoice Intelligence — Live SAP Data")

if st.button("Check API Health"):
    try:
        r = requests.get(f"{API}/api/results", timeout=5)
        if r.ok:
            st.success("API reachable.")
        else:
            st.warning(f"API responded with status {r.status_code}.")
    except Exception as e:
        st.error(str(e))

if st.button("Load Latest Predictions"):
    try:
        r = requests.get(f"{API}/api/results", timeout=10)
        if r.ok and r.json():
            df = pd.DataFrame(r.json())
            st.dataframe(df)
            # Bar chart of delay risk per vendor, if those columns exist.
            if {"vendor", "delay_probability"}.issubset(df.columns):
                st.bar_chart(df[["vendor", "delay_probability"]].set_index("vendor"))
        else:
            st.info("No results yet. Run ZPII_RUN_PIPELINE in SAP first.")
    except Exception as e:
        st.error(str(e))
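To launch the dashboard locally (assuming Streamlit is installed, for example via pip install streamlit), run:
streamlit run streamlit_app.py
By default Streamlit serves it at http://localhost:8501; click “Load Latest Predictions” once ZPII_RUN_PIPELINE has executed in SAP.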
Putting it all together, the full loop looks like this:
Step | What Happens |
1 | SAP extracts live invoice and payment data |
2 | Python cleans data and computes features |
3 | ML model predicts delay probabilities |
4 | Results are written back to SAP |
5 | Streamlit (or Joule) visualizes predictions |
What It Gives You
Proactive cash-flow insight — spot high-risk vendors early.
Smarter conversations with Joule — “Which invoices are likely to be late this month?”
No archive dependency — uses only current FI tables.
End-to-end transparency — you can see exactly how predictions are made.
Final Thought
SAP Joule gave us conversation. PII gives that conversation intuition.
It’s not about replacing SAP — it’s about extending it with the kind of predictive intelligence that finance has always needed. Built on real ABAP. Powered by Python. Designed for people who don’t just want to ask questions — but want to see the future inside their SAP system.


