Beyond Joule: Predictive Foresight with AIS-CRF™
- Walf Sun
- Oct 27
- 4 min read

When SAP Joule arrived, it gave enterprise users something that had been missing for years — a natural conversation with SAP.
For the first time, finance teams could simply ask:
“What’s our DSO this quarter?”
“Which customers are behind on payments?”
and Joule would respond — instantly, contextually, and conversationally — using real SAP data. That's a leap forward. But after working with Joule, I began to notice something deeper.
Joule can tell you what happened. But it doesn’t tell you what’s about to happen.
That’s where AIS-CRF™ (Customer Risk Forecaster) comes in.
The Next Layer: From Descriptive to Predictive
AIS-CRF™ is part of my Archive Intelligence Suite (AIS) — a series of AI-driven extensions that bring foresight into SAP data. While Joule gives SAP a voice, AIS-CRF™ gives it vision.
The system analyzes SAP SD, CO, and FI data to forecast:
Which customers are likely to delay or default on payments, and
Which customers are showing early signs of churn or declining engagement.
This isn’t just another analytics dashboard — it’s a predictive lens built with Python, Streamlit, and ABAP integration.
Joule + AIS-CRF™: Talk Meets Foresight
Here’s how they work together in real time:
Inside SAP Joule, a credit manager asks:
“Show me customers likely to delay payment this month.”
Joule calls the AIS-CRF™ REST API — a lightweight Python microservice hosted on Azure or BTP.
AIS-CRF™ runs a trained model on live SAP data and returns something like:
[ {"customer_id": "C1045", "credit_risk": 0.86, "recommended_action": "Require prepayment"}, {"customer_id": "C1189", "credit_risk": 0.79, "recommended_action": "Shorten terms to Net15"} ]
Joule summarizes:
“Two customers show elevated credit risk — C1045 should move to prepayment, and C1189 needs shorter terms.”
The finance team clicks View Details, opening the Streamlit AIS-CRF dashboard, which shows visual trends — margins, late payments, AR exposure — and actionable recommendations.
That’s how a conversation turns into predictive action.
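To make the contract concrete, here is a minimal sketch of that call from the consumer side, written with Python's requests library. The URL, the filter field in the payload, and the response shape simply mirror the example above — they are illustrative assumptions, not a fixed interface.

# Hypothetical client call to the AIS-CRF™ prediction endpoint (illustrative only)
import requests

resp = requests.post(
    "https://ais-crf.example.com/api/predict",   # assumed host and path
    json={"snapshot_date": "2025-10-31"},        # assumed filter field
    timeout=30,
)
resp.raise_for_status()

# The response is assumed to be the JSON array shown above
for row in resp.json():
    if row["credit_risk"] > 0.7:
        print(f'{row["customer_id"]}: {row["credit_risk"]:.2f} -> {row["recommended_action"]}')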
Architecture at a Glance
SAP SD/CO/FI → ABAP (HTTP POST) → AIS-CRF™ REST API (Python)
                                            ↓
                                Streamlit Dashboard (Visuals)
                                            ↓
                                Joule Conversational Layer
This approach doesn't modify SAP logic. It extends it — cleanly, securely, and intelligently.
Inside the AIS-CRF™ Engine
AIS-CRF™ is built with open-source tools:
Python for data processing and ML (pandas, scikit-learn)
Streamlit for UI and REST serving
ABAP for integration back into SAP
Let’s go through the working components.
1. Data Generator (Simulating SAP Data)
# sample_data/generate_mock_data.py
import os
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
os.makedirs("data", exist_ok=True)

# Customer master data (mock attributes per customer)
n_customers = 400
industries = ['Manufacturing', 'Retail', 'Utilities', 'Healthcare', 'Tech']
regions = ['NA', 'EU', 'APJ']
customers = pd.DataFrame({
    'customer_id': [f'C{1000 + i}' for i in range(n_customers)],
    'industry': rng.choice(industries, n_customers),
    'region': rng.choice(regions, n_customers),
    'credit_limit': rng.integers(20000, 250000, n_customers),
    'acct_open_days': rng.integers(90, 3000, n_customers)
})

today = pd.Timestamp.today().normalize()
start = today - pd.Timedelta(days=540)

def random_date(start, end, size):
    delta = (end - start).days
    return [start + pd.Timedelta(days=int(rng.integers(0, delta))) for _ in range(size)]

# Sales orders (mock SD data)
sales_rows, invoice_rows = [], []
for cid in customers['customer_id']:
    for d in random_date(start, today, int(rng.integers(5, 60))):
        net = max(100, float(rng.normal(5000, 3000)))
        margin = float(np.clip(rng.normal(0.28, 0.12), -0.1, 0.8))
        sales_rows.append([f'SO{rng.integers(10**6, 10**7)}', cid, d, round(net, 2), round(margin, 4)])
sales = pd.DataFrame(sales_rows, columns=['so_id', 'customer_id', 'so_date', 'net_value', 'gross_margin_pct'])

# Invoices and payments (mock FI-AR data); roughly 5% stay unpaid
for cid in customers['customer_id']:
    for inv_d in random_date(start, today, int(rng.integers(5, 40))):
        inv_amt = max(100, float(rng.normal(6000, 3500)))
        due = inv_d + pd.Timedelta(days=int(rng.choice([15, 30, 45, 60])))
        delay = int(np.round(rng.normal(5, 15)))
        pay_date = due + pd.Timedelta(days=delay)
        if rng.random() < 0.05:
            pay_date = pd.NaT
        invoice_rows.append([f'INV{rng.integers(10**6, 10**7)}', cid, inv_d, due, pay_date, inv_amt])
invoices = pd.DataFrame(invoice_rows, columns=['invoice_id', 'customer_id', 'invoice_date', 'due_date', 'payment_date', 'invoice_amount'])

customers.to_csv("data/customers.csv", index=False)
sales.to_csv("data/sales.csv", index=False)
invoices.to_csv("data/invoices.csv", index=False)
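Running python sample_data/generate_mock_data.py writes the three CSVs into data/. A quick sanity check (just a sketch, not part of the suite) confirms the row counts and the share of open invoices:

# Sanity-check the generated mock data (illustrative only)
import pandas as pd

customers = pd.read_csv("data/customers.csv")
sales = pd.read_csv("data/sales.csv")
invoices = pd.read_csv("data/invoices.csv", parse_dates=["invoice_date", "due_date", "payment_date"])

print(len(customers), "customers /", len(sales), "sales orders /", len(invoices), "invoices")
print("open (unpaid) invoices:", invoices["payment_date"].isna().sum())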
2. Feature Engineering
# src/data_prep.py
import numpy as np
import pandas as pd

def _days_between(a, b):
    return (pd.to_datetime(b) - pd.to_datetime(a)).dt.days

def build_features(customers, sales, invoices, snapshot_date=None):
    snapshot_date = pd.Timestamp.today().normalize() if snapshot_date is None else pd.to_datetime(snapshot_date)
    sales['so_date'] = pd.to_datetime(sales['so_date'])
    invoices['invoice_date'] = pd.to_datetime(invoices['invoice_date'])
    invoices['due_date'] = pd.to_datetime(invoices['due_date'])
    invoices['payment_date'] = pd.to_datetime(invoices['payment_date'], errors='coerce')

    # Rolling sales aggregates per customer: order count, revenue, average margin
    def rolling_sales(days):
        min_date = snapshot_date - pd.Timedelta(days=days)
        s = sales[sales['so_date'] >= min_date]
        g = s.groupby('customer_id').agg(
            sales_count=('so_date', 'count'),
            revenue=('net_value', 'sum'),
            avg_margin=('gross_margin_pct', 'mean')
        )
        g.columns = [f'{c}_{days}d' for c in g.columns]
        return g

    s12, s3 = rolling_sales(365), rolling_sales(90)

    # Payment behaviour: open invoices count days late against the snapshot date
    inv = invoices.copy()
    inv['paid'] = ~inv['payment_date'].isna()
    inv['days_late'] = np.where(inv['paid'],
                                _days_between(inv['due_date'], inv['payment_date']),
                                _days_between(inv['due_date'], snapshot_date))
    inv['days_late'] = inv['days_late'].clip(lower=-60, upper=180)
    ginv = inv.groupby('customer_id').agg(
        invoices_total=('invoice_id', 'count'),
        invoices_unpaid=('paid', lambda x: (~x).sum()),
        late_ratio=('days_late', lambda s: (s > 0).mean()),
        avg_days_late=('days_late', 'mean'),
        outstanding_amt=('invoice_amount', lambda a: a[inv.loc[a.index, 'paid'] == False].sum())
    )

    feats = customers.set_index('customer_id').join([s12, s3, ginv])
    feats.fillna(0, inplace=True)
    feats = pd.get_dummies(feats, columns=['industry', 'region'], drop_first=True)

    # Training labels: a past credit event and a revenue-drop churn signal
    feats['label_credit_event'] = ((feats['invoices_unpaid'] > 0) | (feats['avg_days_late'] > 10)).astype(int)
    feats['label_churn'] = ((feats['revenue_365d'] > 0) & (feats['revenue_90d'] < feats['revenue_365d'] * 0.05)).astype(int)
    return feats.reset_index(), snapshot_date
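As a quick illustration (my own snippet, not part of the repo), you can build the feature table straight from the mock CSVs and check how often each label fires before training anything:

# Illustrative use of build_features on the mock CSVs
import pandas as pd
from src.data_prep import build_features

customers = pd.read_csv("data/customers.csv")
sales = pd.read_csv("data/sales.csv")
invoices = pd.read_csv("data/invoices.csv")

feats, snapshot = build_features(customers, sales, invoices)
print("snapshot date:", snapshot.date())
print(feats[["label_credit_event", "label_churn"]].mean())   # label base rates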
3. Model Training
# train_model.py
import os
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from src.data_prep import build_features

customers = pd.read_csv('data/customers.csv')
sales = pd.read_csv('data/sales.csv')
invoices = pd.read_csv('data/invoices.csv')
feats, _ = build_features(customers, sales, invoices)
os.makedirs('models', exist_ok=True)

def train_model(ycol, name):
    # Drop identifiers and labels; everything else is a model feature
    X = feats.drop(columns=['customer_id', 'label_credit_event', 'label_churn'], errors='ignore').fillna(0)
    y = feats[ycol]
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
    clf = RandomForestClassifier(n_estimators=300, class_weight='balanced', random_state=42)
    clf.fit(Xtr, ytr)
    print(f'{name} AUC:', roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))
    # Persist the model together with its feature column order for later scoring
    joblib.dump({'model': clf, 'columns': list(X.columns)}, f'models/{name}.pkl')

train_model('label_credit_event', 'credit_risk')
train_model('label_churn', 'churn_risk')
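Each .pkl packs the classifier together with the exact feature column order it was trained on, so any downstream consumer only has to re-align its columns before scoring. A sketch of that reuse, working on the feats frame built above:

# Sketch: reuse a saved model pack to score a feature frame
import joblib

pack = joblib.load("models/credit_risk.pkl")
X_new = feats.drop(columns=["customer_id", "label_credit_event", "label_churn"], errors="ignore").fillna(0)
X_new = X_new.reindex(columns=pack["columns"], fill_value=0)   # same columns, same order
feats["credit_risk_score"] = pack["model"].predict_proba(X_new)[:, 1]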
4. Streamlit App (Dashboard + REST API)
# app.py
import joblib
import pandas as pd
import streamlit as st
from src.data_prep import build_features

st.title("AIS-CRF™ — Customer Credit & Churn Risk Forecaster")

data_dir = 'data'
customers = pd.read_csv(f'{data_dir}/customers.csv')
sales = pd.read_csv(f'{data_dir}/sales.csv')
invoices = pd.read_csv(f'{data_dir}/invoices.csv')

snapshot = st.date_input('Snapshot date', pd.Timestamp.today().date())
feats, _ = build_features(customers, sales, invoices, snapshot)

models = {'credit_risk': joblib.load('models/credit_risk.pkl'),
          'churn_risk': joblib.load('models/churn_risk.pkl')}

X = feats.drop(columns=['customer_id', 'label_credit_event', 'label_churn'], errors='ignore').fillna(0)
for name, pack in models.items():
    # Align features to the column set and order each model was trained with
    Xm = X.reindex(columns=pack['columns'], fill_value=0)
    feats[f'{name}_score'] = pack['model'].predict_proba(Xm)[:, 1]

feats['recommended_action'] = feats['credit_risk_score'].apply(
    lambda x: 'Require prepayment' if x > 0.7 else 'Offer early-pay discount' if x > 0.4 else 'Maintain normal terms')

st.dataframe(feats[['customer_id', 'credit_risk_score', 'churn_risk_score', 'recommended_action']]
             .sort_values('credit_risk_score', ascending=False))
st.markdown("**Integrate this API with SAP Joule or ABAP via REST for predictive foresight.**")
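The Streamlit app covers the dashboard half. For the REST half that Joule or ABAP calls, a lightweight microservice can load the same model packs and expose a /api/predict route. The sketch below uses FastAPI; the framework choice, route name, and payload fields are my assumptions for illustration, not a fixed part of AIS-CRF™.

# api.py — minimal sketch of the prediction endpoint (FastAPI is an assumption)
import joblib
import pandas as pd
from fastapi import FastAPI
from src.data_prep import build_features

app = FastAPI()
PACKS = {name: joblib.load(f"models/{name}.pkl") for name in ("credit_risk", "churn_risk")}

@app.post("/api/predict")
def predict(payload: dict):
    # Re-score all customers for the requested snapshot; filter if a customer_id is given
    customers = pd.read_csv("data/customers.csv")
    sales = pd.read_csv("data/sales.csv")
    invoices = pd.read_csv("data/invoices.csv")
    feats, _ = build_features(customers, sales, invoices, payload.get("snapshot_date"))

    X = feats.drop(columns=["customer_id", "label_credit_event", "label_churn"], errors="ignore").fillna(0)
    for name, pack in PACKS.items():
        Xm = X.reindex(columns=pack["columns"], fill_value=0)
        feats[f"{name}_score"] = pack["model"].predict_proba(Xm)[:, 1]

    if payload.get("customer_id"):
        feats = feats[feats["customer_id"] == payload["customer_id"]]

    return [
        {"customer_id": r.customer_id,
         "credit_risk": round(float(r.credit_risk_score), 2),
         "churn_risk": round(float(r.churn_risk_score), 2),
         "recommended_action": "Require prepayment" if r.credit_risk_score > 0.7
                               else "Offer early-pay discount" if r.credit_risk_score > 0.4
                               else "Maintain normal terms"}
        for r in feats.itertuples()
    ]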
ABAP Integration: Bringing It into SAP
Here’s how SAP calls the model through a REST API — directly from FD32, FD33, or a Z-report:
DATA: lv_url      TYPE string VALUE 'https://ais-crf.walfsun.com/api/predict',
      lv_json     TYPE string,
      lv_response TYPE string,
      lo_http     TYPE REF TO if_http_client.

DATA: BEGIN OF ls_result,
        credit_risk        TYPE string,
        churn_risk         TYPE string,
        recommended_action TYPE string,
      END OF ls_result.

lv_json = |{{ "customer_id": "{ p_kunnr }", "snapshot_date": "{ sy-datum }" }}|.

cl_http_client=>create_by_url(
  EXPORTING
    url    = lv_url
  IMPORTING
    client = lo_http ).

lo_http->request->set_header_field( name = 'Content-Type' value = 'application/json' ).
lo_http->request->set_method( 'POST' ).
lo_http->request->set_cdata( lv_json ).
lo_http->send( ).
lo_http->receive( ).

lv_response = lo_http->response->get_cdata( ).

" /ui2/cl_json fills the target structure from the JSON response
/ui2/cl_json=>deserialize( EXPORTING json = lv_response
                           CHANGING  data = ls_result ).

WRITE: / 'Credit Risk:', ls_result-credit_risk,
       / 'Churn Risk:',  ls_result-churn_risk,
       / 'Action:',      ls_result-recommended_action.
What it does:
Sends customer data to the Python API
Receives risk scores and recommended actions
Displays them in SAP — or stores in a Z-table (e.g., ZCUST_RISK_SCORE)
From here, Joule can pick up those same values as part of its natural-language responses.
Business Value
Finance gains control — not just insight.
Closing Thought
Joule describes. AIS-CRF™ predicts. Together, they give SAP both a voice and foresight.
Because real intelligence isn't about describing data. It's about anticipating what's next.


