Auto-Detect Popular Candlestick Patterns with Python

Overview
Quickstart
Minimal working example
Steps to add this to your pipeline
Pattern rules used (concise)
Pitfalls and validation
Performance notes
Extending to more patterns
Tiny FAQ

Overview

This guide shows how to auto-detect popular candlestick patterns with Python and pandas. It focuses on practical, vectorized rules you can run on OHLC data for algorithmic trading.

Patterns covered:

Doji
Hammer and Shooting Star
Bullish/Bearish Engulfing
Morning Star (3-candle)

Quickstart

Install dependencies: pandas and numpy
Load OHLC data (open, high, low, close)
Compute body and wick metrics
Apply vectorized boolean rules for each pattern
Optionally filter by trend (e.g., a moving average)

Example setup:

# pip install pandas numpy
import pandas as pd
import numpy as np

# Example: load from CSV with columns: timestamp,open,high,low,close
# df = pd.read_csv("ohlc.csv", parse_dates=["timestamp"]).set_index("timestamp")

Minimal working example

The following self-contained script builds a small OHLC sample and detects each pattern.

import pandas as pd
import numpy as np

# Build a tiny OHLC sample with known patterns.
# Values are kept clear of the rule thresholds so float rounding cannot flip a flag.
rows = [
    #  open,   high,    low,  close
    (100.00, 101.00,  99.50, 100.02),  # 0 doji
    (100.10, 100.60,  99.90, 100.40),  # 1 small bull
    (102.00, 102.20, 101.20, 101.50),  # 2 small bear
    (101.40, 103.50, 101.30, 103.20),  # 3 bullish engulfing of row 2
    (102.80, 103.20, 101.50, 103.10),  # 4 hammer
    (103.20, 103.40, 102.80, 103.00),  # 5 neutral
    (104.00, 104.20, 101.80, 102.00),  # 6 long bear
    (101.90, 102.40, 101.50, 102.10),  # 7 small body near lows
    (102.20, 104.70, 102.00, 104.50),  # 8 strong bull -> morning star
    (104.60, 105.80, 104.10, 104.20),  # 9 shooting star
]

df = pd.DataFrame(rows, columns=["open","high","low","close"]).astype(float)

# Derived metrics
rng = (df["high"] - df["low"]).replace(0, np.finfo(float).eps)
body = (df["close"] - df["open"]).abs()
upper_wick = df["high"] - df[["open","close"]].max(axis=1)
lower_wick = df[["open","close"]].min(axis=1) - df["low"]

# Core booleans
bull = df["close"] > df["open"]
bear = df["close"] < df["open"]

# 1) Doji: tiny body relative to range
DOJI_THR = 0.1
df["doji"] = (body / rng) <= DOJI_THR

# 2) Hammer: long lower wick, small upper wick, close in upper part
df["hammer"] = (
    (lower_wick >= 2 * body) &
    (upper_wick <= body) &
    ((df["close"] - df["low"]) / rng > 0.5)
)

# 3) Shooting star: mirror of hammer
df["shooting_star"] = (
    (upper_wick >= 2 * body) &
    (lower_wick <= body) &
    ((df["high"] - df["close"]) / rng > 0.5)
)

# 4) Engulfing (2-candle)
prev_open = df["open"].shift(1)
prev_close = df["close"].shift(1)
prev_body = (prev_close - prev_open).abs()

# Bullish engulfing: prev bear, curr bull, body engulfs prior body
df["bull_engulf"] = (
    (prev_close < prev_open) & bull &
    (df["open"] <= prev_close) & (df["close"] >= prev_open) &
    (body > prev_body)
)

# Bearish engulfing: prev bull, curr bear, body engulfs prior body
df["bear_engulf"] = (
    (prev_close > prev_open) & bear &
    (df["open"] >= prev_close) & (df["close"] <= prev_open) &
    (body > prev_body)
)

# 5) Morning star (3-candle): bear -> small -> strong bull closing above mid of day -2
prev2_open = df["open"].shift(2)
prev2_close = df["close"].shift(2)
prev1_open = df["open"].shift(1)
prev1_close = df["close"].shift(1)
prev1_body = (prev1_close - prev1_open).abs()
prev1_rng = (df["high"].shift(1) - df["low"].shift(1)).replace(0, np.finfo(float).eps)

small_prev1 = (prev1_body / prev1_rng) < 0.3
mid_prev2 = (prev2_open + prev2_close) / 2

df["morning_star"] = (
    (prev2_close < prev2_open) &  # day -2 is bearish
    small_prev1 &                 # day -1 small-bodied
    bull &                        # current is bullish
    (df["close"] > mid_prev2)     # close above midpoint of day -2
)

patterns = [
    "doji", "hammer", "shooting_star", "bull_engulf", "bear_engulf", "morning_star"
]

print(df[patterns])
print("\nDetections (rows with any pattern):")
print(df.loc[df[patterns].any(axis=1), ["open","high","low","close"] + patterns])

Expected: row 0 (doji), 3 (bull_engulf), 4 (hammer), 8 (morning_star), 9 (shooting_star) flag True.

Steps to add this to your pipeline

Load and clean OHLC

Ensure columns open, high, low, close are float.
Validate: high >= max(open, close), low <= min(open, close).
Drop duplicates and sort by timestamp.
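
The cleaning checks above can be sketched as a single helper. This is a minimal illustration, not a definitive implementation: the name clean_ohlc is hypothetical, it assumes the timestamp is already the index, and it assumes that dropping inconsistent bars (rather than raising or repairing them) is acceptable for your data.

```python
import pandas as pd

def clean_ohlc(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: float OHLC, de-duplicated, sorted, inconsistent bars dropped."""
    cols = ["open", "high", "low", "close"]
    out = df.copy()
    # Coerce to float; unparseable values become NaN and are dropped.
    out[cols] = out[cols].apply(pd.to_numeric, errors="coerce").astype(float)
    out = out.dropna(subset=cols)
    # Keep the last bar seen for each timestamp, then sort chronologically.
    out = out[~out.index.duplicated(keep="last")].sort_index()
    # A consistent bar has high >= max(open, close) and low <= min(open, close).
    ok = (out["high"] >= out[["open", "close"]].max(axis=1)) & (
        out["low"] <= out[["open", "close"]].min(axis=1)
    )
    return out[ok]

raw = pd.DataFrame(
    {
        "open": [10.0, 10.0, 10.5, 11.0],
        "high": [10.6, 10.6, 10.4, 11.5],  # third bar: high below open -> inconsistent
        "low": [9.8, 9.8, 10.1, 10.9],
        "close": [10.4, 10.4, 10.3, 11.2],
    },
    index=pd.to_datetime(["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-01"]),
)
cleaned = clean_ohlc(raw)
print(cleaned)
```

Here the duplicated 2024-01-02 bar is collapsed to one row, the inconsistent 2024-01-03 bar is dropped, and the remaining two bars come back in chronological order.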

Compute derived columns

Range, body, upper/lower wicks.
Use replace(0, eps) to prevent division-by-zero.

Implement pattern rules

Prefer vectorized boolean expressions over loops.
Parameterize thresholds (e.g., doji body/range <= 0.1).
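
One way to parameterize thresholds is to wrap each rule in a small function with keyword arguments. The sketch below reuses the doji and hammer rules from the minimal example; the function names (candle_metrics, detect_doji, detect_hammer) and parameter names are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

def candle_metrics(df):
    """Return range, body, upper wick, lower wick as Series."""
    rng = (df["high"] - df["low"]).replace(0, np.finfo(float).eps)
    body = (df["close"] - df["open"]).abs()
    upper = df["high"] - df[["open", "close"]].max(axis=1)
    lower = df[["open", "close"]].min(axis=1) - df["low"]
    return rng, body, upper, lower

def detect_doji(df, max_body_ratio=0.1):
    """Doji: body is at most max_body_ratio of the bar's range."""
    rng, body, _, _ = candle_metrics(df)
    return (body / rng) <= max_body_ratio

def detect_hammer(df, wick_mult=2.0, min_close_pos=0.5):
    """Hammer: long lower wick, short upper wick, close in the upper part of the range."""
    rng, body, upper, lower = candle_metrics(df)
    close_pos = (df["close"] - df["low"]) / rng
    return (lower >= wick_mult * body) & (upper <= body) & (close_pos > min_close_pos)

sample = pd.DataFrame({
    "open":  [100.00, 102.80],
    "high":  [101.00, 103.20],
    "low":   [99.50, 101.50],
    "close": [100.02, 103.10],
})
print(detect_doji(sample).tolist())                       # default threshold
print(detect_doji(sample, max_body_ratio=0.2).tolist())   # looser threshold
print(detect_hammer(sample).tolist())
```

Keeping thresholds as arguments makes it straightforward to tune them per instrument or timeframe without editing the rule logic.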

Add context filters (optional)

Example: require a prior downtrend for bullish reversals.

ma = df["close"].rolling(20, min_periods=20).mean()
df["downtrend"] = df["close"] < ma
signals = df["hammer"] & df["downtrend"]

Aggregate and act

Combine signals into a column, forward-fill only if necessary.
Align with entry/exit rules to avoid look-ahead bias.
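
A minimal sketch of aligning signals with entries, assuming you trade at the next bar's open: a pattern on bar t is only known once bar t has closed, so the signal is shifted forward by one bar. The column names signal, enter, and entry_price are illustrative assumptions.

```python
import pandas as pd

bars = pd.DataFrame({
    "open":   [10.0, 10.2, 10.1, 10.4],
    "close":  [10.1, 10.0, 10.5, 10.6],
    "signal": [False, True, False, True],  # pattern detected at this bar's close
})

# Shift by one bar: the earliest realistic fill is the open of the bar after the signal.
bars["enter"] = bars["signal"].shift(1, fill_value=False)
# Entry price is the open of the entry bar; NaN where there is no entry.
bars["entry_price"] = bars["open"].where(bars["enter"])
print(bars)
```

The signal on the final bar produces no entry here because the next bar does not exist yet, which is the behavior you want in a backtest.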

Pattern rules used (concise)

Let body = |close - open|, range = high - low.

Doji: body/range <= 0.1
Hammer: lower_wick >= 2*body, upper_wick <= body, close in the upper half of the range
Shooting star: upper_wick >= 2*body, lower_wick <= body, close in the lower half of the range
Bullish engulfing: prev bear, curr bull, curr body engulfs prev body
Bearish engulfing: prev bull, curr bear, curr body engulfs prev body
Morning star: bear (t-2) -> small (t-1) -> bull (t) closing above midpoint of (t-2)

Tune thresholds to your instrument and timeframe.
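
To see how sensitive a rule is before committing to a threshold, you can count detections across a range of values. The sketch below does this for the doji rule on a handful of made-up candles; the data and the threshold grid are illustrative assumptions only.

```python
import numpy as np
import pandas as pd

ohlc = pd.DataFrame(
    [
        (100.0, 101.0, 99.5, 100.02),
        (102.8, 103.2, 101.5, 103.10),
        (103.2, 103.4, 102.8, 103.00),
        (101.9, 102.4, 101.5, 102.10),
        (100.0, 101.0, 100.0, 100.08),
    ],
    columns=["open", "high", "low", "close"],
)

rng = (ohlc["high"] - ohlc["low"]).replace(0, np.finfo(float).eps)
ratio = (ohlc["close"] - ohlc["open"]).abs() / rng

# Number of candles that qualify as a doji at each candidate threshold.
counts = {thr: int((ratio <= thr).sum()) for thr in (0.05, 0.10, 0.20, 0.30)}
print(counts)
```

On real data, a count that jumps sharply between neighboring thresholds suggests the signal is fragile at that setting.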

Pitfalls and validation

Data quality: Ensure high/low aren’t inverted; handle missing bars and outliers.
Adjusted data: Corporate actions can distort candles; use adjusted OHLC for equities.
Gaps: Some patterns assume gaps; in 24/7 markets they’re rare, so adapt the rules.
Look-ahead bias: A pattern is only confirmed when its bar closes, so don’t enter at that same close; act on the next bar.
Timezone/holidays: Align sessions; avoid mixing RTH with after-hours unless intended.
Resampling: When building higher timeframes, use proper OHLC aggregation, not mean.
Parameter sensitivity: Small threshold tweaks can change signals; backtest ranges.
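
For the resampling point above, a minimal sketch of OHLC aggregation with pandas: each higher-timeframe bar takes the first open, the maximum high, the minimum low, and the last close. The one-minute sample data and the three-minute target are made up for illustration.

```python
import pandas as pd

idx = pd.date_range("2024-01-02 09:30", periods=6, freq="1min")
m1 = pd.DataFrame(
    {
        "open":  [10.0, 10.1, 10.3, 10.2, 10.4, 10.6],
        "high":  [10.2, 10.4, 10.3, 10.5, 10.7, 10.8],
        "low":   [9.9, 10.0, 10.1, 10.2, 10.3, 10.5],
        "close": [10.1, 10.3, 10.2, 10.4, 10.6, 10.7],
    },
    index=idx,
)

# Aggregate 1-minute bars into 3-minute bars with per-column rules, never a mean.
m3 = m1.resample("3min").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last"}
)
print(m3)
```

Averaging the columns instead would shrink wicks and bodies, which changes which candles satisfy the pattern rules.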

Performance notes

Vectorize: Use pandas/numpy boolean arrays; avoid Python loops and row-wise apply.
Memory: Use float32 where precision allows; drop intermediates after computing signals.
Chunking: For very large histories, process in chunks and write signals to disk.
Grouped runs: For multi-symbol datasets, groupby("symbol") and compute shifts per group so that multi-candle patterns never span two symbols.
Parallelism: Parallelize across symbols or time windows with multiprocessing.
Alternatives: Libraries like TA-Lib or pandas_ta provide prebuilt pattern functions; benchmark them vs your vectorized rules.
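
The grouped-run advice matters most for multi-candle patterns. The sketch below, which assumes a long-format frame with a symbol column and rows already sorted by time within each symbol, compares a plain shift with a per-group shift for the bullish engulfing rule. The helper name bull_engulf and the sample data are hypothetical.

```python
import pandas as pd

multi = pd.DataFrame({
    "symbol": ["AAA", "AAA", "BBB", "BBB"],
    "open":   [10.0, 10.6, 10.0, 10.9],
    "close":  [10.5, 10.1, 10.9, 11.0],
})

def bull_engulf(df, prev_open, prev_close):
    """Bullish engulfing rule from the minimal example, given previous-bar values."""
    body = (df["close"] - df["open"]).abs()
    prev_body = (prev_close - prev_open).abs()
    return (
        (prev_close < prev_open) & (df["close"] > df["open"]) &
        (df["open"] <= prev_close) & (df["close"] >= prev_open) &
        (body > prev_body)
    )

# Naive shift: the first BBB bar is compared against the last AAA bar.
naive = bull_engulf(multi, multi["open"].shift(1), multi["close"].shift(1))

# Per-symbol shift: the first bar of each symbol has no previous bar (NaN -> False).
g = multi.groupby("symbol", sort=False)
grouped = bull_engulf(multi, g["open"].shift(1), g["close"].shift(1))

print(naive.tolist())
print(grouped.tolist())
```

The naive version reports an engulfing pattern on the first BBB bar only because it borrows the last AAA bar as its predecessor; the grouped version does not.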

Extending to more patterns

Harami (inside bar body): current body is contained within previous body.
Piercing/Dark Cloud Cover: two-candle penetrations of prior body.
Three White Soldiers/Black Crows: sequences of strong candles in the same direction.

Implement by combining shifts, body thresholds, and relative close conditions.
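
As a rough sketch of that approach, here are two of the patterns above expressed with shifts. The Harami rule follows the description given here (current body inside the previous body). The Piercing rule is one common formulation and an assumption on my part: previous bar bearish, current bar bullish, opening below the previous close and closing above the midpoint of the previous body but below its open. The sample candles are made up.

```python
import pandas as pd

d = pd.DataFrame({
    "open":  [105.0, 102.0, 110.0, 103.5],
    "high":  [105.5, 103.5, 110.2, 108.0],
    "low":   [100.5, 101.5, 103.8, 103.0],
    "close": [101.0, 103.0, 104.0, 107.5],
})

o, c = d["open"], d["close"]
po, pc = o.shift(1), c.shift(1)

# Harami: current body lies within the previous body.
body_hi = d[["open", "close"]].max(axis=1)
body_lo = d[["open", "close"]].min(axis=1)
d["harami"] = (body_hi <= body_hi.shift(1)) & (body_lo >= body_lo.shift(1))

# Piercing (assumed formulation): bear then bull, open below the previous close,
# close above the previous body's midpoint but below the previous open.
prev_mid = (po + pc) / 2
d["piercing"] = (pc < po) & (c > o) & (o < pc) & (c > prev_mid) & (c < po)

print(d[["harami", "piercing"]])
```

Dark Cloud Cover is the mirror image of the Piercing rule, and the three-candle sequences follow the same idea with shift(2).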