Beyond Joule: Predictive Foresight with AIS-CRF™
- Walf Sun
- Oct 27
- 4 min read

When SAP Joule arrived, it gave enterprise users something that had been missing for years — a natural conversation with SAP.
For the first time, finance teams could simply ask:
“What’s our DSO this quarter?”
“Which customers are behind on payments?”
and Joule would respond — instantly, contextually, and conversationally — using real SAP data. That's a leap forward. But after working with Joule, I began to notice something deeper.
Joule can tell you what happened. But it doesn’t tell you what’s about to happen.
That’s where AIS-CRF™ (Customer Risk Forecaster) comes in.
The Next Layer: From Descriptive to Predictive
AIS-CRF™ is part of my Archive Intelligence Suite (AIS) — a series of AI-driven extensions that bring foresight into SAP data. While Joule gives SAP a voice, AIS-CRF™ gives it vision.
The system analyzes SAP SD, CO, and FI data to forecast:
Which customers are likely to delay or default on payments, and
Which customers are showing early signs of churn or declining engagement.
This isn’t just another analytics dashboard — it’s a predictive lens built with Python, Streamlit, and ABAP integration.
Joule + AIS-CRF™: Talk Meets Foresight
Here’s how they work together in real time:
Inside SAP Joule, a credit manager asks:
“Show me customers likely to delay payment this month.”
Joule calls the AIS-CRF™ REST API — a lightweight Python microservice hosted on Azure or BTP.
AIS-CRF™ runs a trained model on live SAP data and returns something like:
[ {"customer_id": "C1045", "credit_risk": 0.86, "recommended_action": "Require prepayment"}, {"customer_id": "C1189", "credit_risk": 0.79, "recommended_action": "Shorten terms to Net15"} ]
Joule summarizes:
“Two customers show elevated credit risk — C1045 should move to prepayment, and C1189 needs shorter terms.”
The finance team clicks View Details, opening the Streamlit AIS-CRF dashboard, which shows visual trends — margins, late payments, AR exposure — and actionable recommendations.
That’s how a conversation turns into predictive action.
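To make the contract concrete, here is a minimal sketch of that call from the consumer side, written with Python's requests library. The URL, the filter field in the payload, and the response shape simply mirror the example above — they are illustrative assumptions, not a fixed interface.

# Hypothetical client call to the AIS-CRF™ prediction endpoint (illustrative only)
import requests

resp = requests.post(
    "https://ais-crf.example.com/api/predict",   # assumed host and path
    json={"snapshot_date": "2025-10-31"},        # assumed filter field
    timeout=30,
)
resp.raise_for_status()

# The response is assumed to be the JSON array shown above
for row in resp.json():
    if row["credit_risk"] > 0.7:
        print(f'{row["customer_id"]}: {row["credit_risk"]:.2f} -> {row["recommended_action"]}')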
Architecture at a Glance
SAP SD/CO/FI → ABAP (HTTP POST) → AIS-CRF™ REST API (Python)
                                            ↓
                                Streamlit Dashboard (Visuals)
                                            ↓
                                Joule Conversational Layer
This approach doesn't modify SAP logic. It extends it — cleanly, securely, and intelligently.
Inside the AIS-CRF™ Engine
AIS-CRF™ is built with open-source tools:
Python for data processing and ML (pandas, scikit-learn)
Streamlit for UI and REST serving
ABAP for integration back into SAP
Let’s go through the working components.
1. Data Generator (Simulating SAP Data)
# sample_data/generate_mock_data.py
import os
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
os.makedirs("data", exist_ok=True)

# Customer master data (mock attributes per customer)
n_customers = 400
industries = ['Manufacturing', 'Retail', 'Utilities', 'Healthcare', 'Tech']
regions = ['NA', 'EU', 'APJ']
customers = pd.DataFrame({
    'customer_id': [f'C{1000 + i}' for i in range(n_customers)],
    'industry': rng.choice(industries, n_customers),
    'region': rng.choice(regions, n_customers),
    'credit_limit': rng.integers(20000, 250000, n_customers),
    'acct_open_days': rng.integers(90, 3000, n_customers)
})

today = pd.Timestamp.today().normalize()
start = today - pd.Timedelta(days=540)

def random_date(start, end, size):
    delta = (end - start).days
    return [start + pd.Timedelta(days=int(rng.integers(0, delta))) for _ in range(size)]

# Sales orders (mock SD data)
sales_rows, invoice_rows = [], []
for cid in customers['customer_id']:
    for d in random_date(start, today, int(rng.integers(5, 60))):
        net = max(100, float(rng.normal(5000, 3000)))
        margin = float(np.clip(rng.normal(0.28, 0.12), -0.1, 0.8))
        sales_rows.append([f'SO{rng.integers(10**6, 10**7)}', cid, d, round(net, 2), round(margin, 4)])
sales = pd.DataFrame(sales_rows, columns=['so_id', 'customer_id', 'so_date', 'net_value', 'gross_margin_pct'])

# Invoices and payments (mock FI-AR data); roughly 5% stay unpaid
for cid in customers['customer_id']:
    for inv_d in random_date(start, today, int(rng.integers(5, 40))):
        inv_amt = max(100, float(rng.normal(6000, 3500)))
        due = inv_d + pd.Timedelta(days=int(rng.choice([15, 30, 45, 60])))
        delay = int(np.round(rng.normal(5, 15)))
        pay_date = due + pd.Timedelta(days=delay)
        if rng.random() < 0.05:
            pay_date = pd.NaT
        invoice_rows.append([f'INV{rng.integers(10**6, 10**7)}', cid, inv_d, due, pay_date, inv_amt])
invoices = pd.DataFrame(invoice_rows, columns=['invoice_id', 'customer_id', 'invoice_date', 'due_date', 'payment_date', 'invoice_amount'])

customers.to_csv("data/customers.csv", index=False)
sales.to_csv("data/sales.csv", index=False)
invoices.to_csv("data/invoices.csv", index=False)
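Running python sample_data/generate_mock_data.py writes the three CSVs into data/. A quick sanity check (just a sketch, not part of the suite) confirms the row counts and the share of open invoices:

# Sanity-check the generated mock data (illustrative only)
import pandas as pd

customers = pd.read_csv("data/customers.csv")
sales = pd.read_csv("data/sales.csv")
invoices = pd.read_csv("data/invoices.csv", parse_dates=["invoice_date", "due_date", "payment_date"])

print(len(customers), "customers /", len(sales), "sales orders /", len(invoices), "invoices")
print("open (unpaid) invoices:", invoices["payment_date"].isna().sum())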
2. Feature Engineering
# src/data_prep.py
import numpy as np
import pandas as pd

def _days_between(a, b):
    return (pd.to_datetime(b) - pd.to_datetime(a)).dt.days

def build_features(customers, sales, invoices, snapshot_date=None):
    snapshot_date = pd.Timestamp.today().normalize() if snapshot_date is None else pd.to_datetime(snapshot_date)
    sales['so_date'] = pd.to_datetime(sales['so_date'])
    invoices['invoice_date'] = pd.to_datetime(invoices['invoice_date'])
    invoices['due_date'] = pd.to_datetime(invoices['due_date'])
    invoices['payment_date'] = pd.to_datetime(invoices['payment_date'], errors='coerce')

    # Rolling sales aggregates per customer: order count, revenue, average margin
    def rolling_sales(days):
        min_date = snapshot_date - pd.Timedelta(days=days)
        s = sales[sales['so_date'] >= min_date]
        g = s.groupby('customer_id').agg(
            sales_count=('so_date', 'count'),
            revenue=('net_value', 'sum'),
            avg_margin=('gross_margin_pct', 'mean')
        )
        g.columns = [f'{c}_{days}d' for c in g.columns]
        return g

    s12, s3 = rolling_sales(365), rolling_sales(90)

    # Payment behaviour: open invoices count days late against the snapshot date
    inv = invoices.copy()
    inv['paid'] = ~inv['payment_date'].isna()
    inv['days_late'] = np.where(inv['paid'],
                                _days_between(inv['due_date'], inv['payment_date']),
                                _days_between(inv['due_date'], snapshot_date))
    inv['days_late'] = inv['days_late'].clip(lower=-60, upper=180)
    ginv = inv.groupby('customer_id').agg(
        invoices_total=('invoice_id', 'count'),
        invoices_unpaid=('paid', lambda x: (~x).sum()),
        late_ratio=('days_late', lambda s: (s > 0).mean()),
        avg_days_late=('days_late', 'mean'),
        outstanding_amt=('invoice_amount', lambda a: a[inv.loc[a.index, 'paid'] == False].sum())
    )

    feats = customers.set_index('customer_id').join([s12, s3, ginv])
    feats.fillna(0, inplace=True)
    feats = pd.get_dummies(feats, columns=['industry', 'region'], drop_first=True)

    # Training labels: a past credit event and a revenue-drop churn signal
    feats['label_credit_event'] = ((feats['invoices_unpaid'] > 0) | (feats['avg_days_late'] > 10)).astype(int)
    feats['label_churn'] = ((feats['revenue_365d'] > 0) & (feats['revenue_90d'] < feats['revenue_365d'] * 0.05)).astype(int)
    return feats.reset_index(), snapshot_date
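As a quick illustration (my own snippet, not part of the repo), you can build the feature table straight from the mock CSVs and check how often each label fires before training anything:

# Illustrative use of build_features on the mock CSVs
import pandas as pd
from src.data_prep import build_features

customers = pd.read_csv("data/customers.csv")
sales = pd.read_csv("data/sales.csv")
invoices = pd.read_csv("data/invoices.csv")

feats, snapshot = build_features(customers, sales, invoices)
print("snapshot date:", snapshot.date())
print(feats[["label_credit_event", "label_churn"]].mean())   # label base rates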
3. Model Training
# train_model.py
import os
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from src.data_prep import build_features

customers = pd.read_csv('data/customers.csv')
sales = pd.read_csv('data/sales.csv')
invoices = pd.read_csv('data/invoices.csv')
feats, _ = build_features(customers, sales, invoices)
os.makedirs('models', exist_ok=True)

def train_model(ycol, name):
    # Drop identifiers and labels; everything else is a model feature
    X = feats.drop(columns=['customer_id', 'label_credit_event', 'label_churn'], errors='ignore').fillna(0)
    y = feats[ycol]
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
    clf = RandomForestClassifier(n_estimators=300, class_weight='balanced', random_state=42)
    clf.fit(Xtr, ytr)
    print(f'{name} AUC:', roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))
    # Persist the model together with its feature column order for later scoring
    joblib.dump({'model': clf, 'columns': list(X.columns)}, f'models/{name}.pkl')

train_model('label_credit_event', 'credit_risk')
train_model('label_churn', 'churn_risk')
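Each .pkl packs the classifier together with the exact feature column order it was trained on, so any downstream consumer only has to re-align its columns before scoring. A sketch of that reuse, working on the feats frame built above:

# Sketch: reuse a saved model pack to score a feature frame
import joblib

pack = joblib.load("models/credit_risk.pkl")
X_new = feats.drop(columns=["customer_id", "label_credit_event", "label_churn"], errors="ignore").fillna(0)
X_new = X_new.reindex(columns=pack["columns"], fill_value=0)   # same columns, same order
feats["credit_risk_score"] = pack["model"].predict_proba(X_new)[:, 1]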
4. Streamlit App (Dashboard + REST API)
# app.py
import joblib
import pandas as pd
import streamlit as st
from src.data_prep import build_features

st.title("AIS-CRF™ — Customer Credit & Churn Risk Forecaster")

data_dir = 'data'
customers = pd.read_csv(f'{data_dir}/customers.csv')
sales = pd.read_csv(f'{data_dir}/sales.csv')
invoices = pd.read_csv(f'{data_dir}/invoices.csv')

snapshot = st.date_input('Snapshot date', pd.Timestamp.today().date())
feats, _ = build_features(customers, sales, invoices, snapshot)

models = {'credit_risk': joblib.load('models/credit_risk.pkl'),
          'churn_risk': joblib.load('models/churn_risk.pkl')}

X = feats.drop(columns=['customer_id', 'label_credit_event', 'label_churn'], errors='ignore').fillna(0)
for name, pack in models.items():
    # Align features to the column set and order each model was trained with
    Xm = X.reindex(columns=pack['columns'], fill_value=0)
    feats[f'{name}_score'] = pack['model'].predict_proba(Xm)[:, 1]

feats['recommended_action'] = feats['credit_risk_score'].apply(
    lambda x: 'Require prepayment' if x > 0.7 else 'Offer early-pay discount' if x > 0.4 else 'Maintain normal terms')

st.dataframe(feats[['customer_id', 'credit_risk_score', 'churn_risk_score', 'recommended_action']]
             .sort_values('credit_risk_score', ascending=False))
st.markdown("**Integrate this API with SAP Joule or ABAP via REST for predictive foresight.**")
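The Streamlit app covers the dashboard half. For the REST half that Joule or ABAP calls, a lightweight microservice can load the same model packs and expose a /api/predict route. The sketch below uses FastAPI; the framework choice, route name, and payload fields are my assumptions for illustration, not a fixed part of AIS-CRF™.

# api.py — minimal sketch of the prediction endpoint (FastAPI is an assumption)
import joblib
import pandas as pd
from fastapi import FastAPI
from src.data_prep import build_features

app = FastAPI()
PACKS = {name: joblib.load(f"models/{name}.pkl") for name in ("credit_risk", "churn_risk")}

@app.post("/api/predict")
def predict(payload: dict):
    # Re-score all customers for the requested snapshot; filter if a customer_id is given
    customers = pd.read_csv("data/customers.csv")
    sales = pd.read_csv("data/sales.csv")
    invoices = pd.read_csv("data/invoices.csv")
    feats, _ = build_features(customers, sales, invoices, payload.get("snapshot_date"))

    X = feats.drop(columns=["customer_id", "label_credit_event", "label_churn"], errors="ignore").fillna(0)
    for name, pack in PACKS.items():
        Xm = X.reindex(columns=pack["columns"], fill_value=0)
        feats[f"{name}_score"] = pack["model"].predict_proba(Xm)[:, 1]

    if payload.get("customer_id"):
        feats = feats[feats["customer_id"] == payload["customer_id"]]

    return [
        {"customer_id": r.customer_id,
         "credit_risk": round(float(r.credit_risk_score), 2),
         "churn_risk": round(float(r.churn_risk_score), 2),
         "recommended_action": "Require prepayment" if r.credit_risk_score > 0.7
                               else "Offer early-pay discount" if r.credit_risk_score > 0.4
                               else "Maintain normal terms"}
        for r in feats.itertuples()
    ]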
ABAP Integration: Bringing It into SAP
Here’s how SAP calls the model through a REST API — directly from FD32, FD33, or a Z-report:
DATA: lv_url      TYPE string VALUE 'https://ais-crf.walfsun.com/api/predict',
      lv_json     TYPE string,
      lv_response TYPE string,
      lo_http     TYPE REF TO if_http_client.

DATA: BEGIN OF ls_result,
        credit_risk        TYPE string,
        churn_risk         TYPE string,
        recommended_action TYPE string,
      END OF ls_result.

lv_json = |{{ "customer_id": "{ p_kunnr }", "snapshot_date": "{ sy-datum }" }}|.

cl_http_client=>create_by_url(
  EXPORTING
    url    = lv_url
  IMPORTING
    client = lo_http ).

lo_http->request->set_header_field( name = 'Content-Type' value = 'application/json' ).
lo_http->request->set_method( 'POST' ).
lo_http->request->set_cdata( lv_json ).
lo_http->send( ).
lo_http->receive( ).

lv_response = lo_http->response->get_cdata( ).

" /ui2/cl_json fills the target structure from the JSON response
/ui2/cl_json=>deserialize( EXPORTING json = lv_response
                           CHANGING  data = ls_result ).

WRITE: / 'Credit Risk:', ls_result-credit_risk,
       / 'Churn Risk:',  ls_result-churn_risk,
       / 'Action:',      ls_result-recommended_action.
What it does:
Sends customer data to the Python API
Receives risk scores and recommended actions
Displays them in SAP — or stores in a Z-table (e.g., ZCUST_RISK_SCORE)
From here, Joule can pick up those same values as part of its natural-language responses.
Business Value
Finance gains control — not just insight.
Closing Thought
Joule describes. AIS-CRF™ predicts. Together, they give SAP both a voice and foresight.
Because real intelligence isn't about describing data. It's about anticipating what's next.


